Mar 21, 2015

SSL handshake errors can occur for various reasons, such as a self-signed certificate or the unavailability of a protocol or cipher suite requested by the client or the server. I recently faced this issue while connecting to a third-party server using the HttpClient library. Here is what I did to identify the cause.

First, I enabled SSL debug logging (handshake and failure messages) for the javax.net packages by adding the following JVM option:

-Djavax.net.debug=ssl,handshake,failure


On examining the logs, I could see that the third-party site expected a 256-bit cipher key, while the only keys supported by my GlassFish server were 128 bits long. This happens because, out of the box, Java 6, 7 and 8 (before update 8u161, which enables unlimited strength by default) support only 128-bit encryption keys. To enable key lengths of 256 bits or more, you need to download the Java Cryptography Extension (JCE) Unlimited Strength Jurisdiction Policy Files. The download essentially contains two jars, US_export_policy.jar and local_policy.jar. Place them in the <JRE_HOME>/lib/security/ directory and restart the server.

This step enables encryption keys of 256 bits or more and ensures that you do not face SSL handshake errors due to key strength.
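A quick way to confirm that the policy files took effect is to ask the JVM for the largest AES key it permits. The sketch below relies only on the standard javax.crypto.Cipher.getMaxAllowedKeyLength call; the class name KeyStrengthCheck and its helper methods are made up for illustration.

```java
import javax.crypto.Cipher;
import java.security.NoSuchAlgorithmException;

final class KeyStrengthCheck {

    // Returns the largest AES key size (in bits) allowed by the installed policy files.
    static int maxAesKeyLength() {
        try {
            return Cipher.getMaxAllowedKeyLength("AES");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("AES is not available in this JVM", e);
        }
    }

    // True when the current policy allows 256-bit AES keys.
    static boolean supports256BitKeys() {
        return maxAesKeyLength() >= 256;
    }

    public static void main(String[] args) {
        System.out.println("Max AES key length: " + maxAesKeyLength());
        System.out.println("256-bit keys allowed: " + supports256BitKeys());
    }
}
```

Run it with the same JRE that the server uses, once before and once after copying the jars, and compare the reported key length.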



You can download the policy files from the following links:

JCE Unlimited for Java 6

JCE Unlimited for Java 7

JCE Unlimited for Java 8

Posted on Saturday, March 21, 2015 by raman nanda

If you use ProGuard to obfuscate your code and your application uses Retrofit, you need to configure ProGuard to keep certain Retrofit classes from being obfuscated. Also, if you use GSON to convert JSON into POJOs, you must exclude those POJO classes from obfuscation. GSON matches JSON keys against POJO field names, so once the field names are renamed the fields are no longer populated. In brief, you should use the following configuration.

-keep class com.squareup.** { *; }
-keep interface com.squareup.** { *; }
-dontwarn com.squareup.okhttp.**
-keep class retrofit.** { *; }
-keep interface retrofit.** { *; }

-keepclasseswithmembers class * {
    @retrofit.http.* <methods>;
}

-dontwarn rx.**
-dontwarn retrofit.**

# Keep the POJOs that you have created for mapping the JSON response, for example:
-keep class com.blogspot.ramannanda.apps.xyz.FeedlyResponse { *; }


Here, FeedlyResponse is just a POJO class that maps to the JSON fields returned by the Feedly feed search API.
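To see why renamed fields break the mapping, here is a minimal sketch of name-based binding using plain reflection. It is a stand-in for what a JSON mapper does, not GSON's actual implementation, and the classes FeedItem and RenamedFeedItem are hypothetical.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Field;
import java.util.Map;

final class FieldNameBindingSketch {

    // POJO whose field names match the JSON keys.
    static class FeedItem {
        String title;
        String website;
    }

    // The same POJO as an obfuscator might leave it: identical data, renamed fields.
    static class RenamedFeedItem {
        String a;
        String b;
    }

    // Copies each JSON value into the String field that has the same name as its key.
    static <T> T bind(Map<String, String> json, Class<T> type) {
        try {
            Constructor<T> constructor = type.getDeclaredConstructor();
            constructor.setAccessible(true);
            T target = constructor.newInstance();
            for (Field field : type.getDeclaredFields()) {
                if (field.isSynthetic() || field.getType() != String.class) {
                    continue;
                }
                String value = json.get(field.getName());
                if (value != null) {
                    field.setAccessible(true);
                    field.set(target, value);
                }
            }
            return target;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot bind " + type.getName(), e);
        }
    }
}
```

Binding a map with the keys title and website fills FeedItem, while RenamedFeedItem comes back with both fields still null, which is the silent failure the keep rule prevents.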

Posted on Saturday, March 21, 2015 by raman nanda

Mar 4, 2015

I recently reviewed this title and found it short on a few important considerations, such as cross-client authorization. It is definitely a book for developers who are beginning Android application development, but it isn't comprehensive.

I first describe what each chapter covers and then offer suggestions on how the book could be improved.

Chapter 1: Android Security Issues

  1. Talks about the different security compliance standards.
  2. Describes the common problems in Android applications.
  3. Shows how easily someone can reverse-engineer your application's code.

Chapter 2: Protecting your code
Here the author talks about why you should obfuscate your code. The chapter starts by explaining how easy it is to reverse-engineer code that is not obfuscated. Obfuscation tools are then covered to show how to obfuscate your application's code. The author then turns to disassemblers to show that, even though obfuscation might deter someone from looking at your code, it might not truly prevent someone from hacking your application.

Chapter 3: Authentication
Here the author talks about different authentication schemes, such as username/password and Facebook login.

Chapter 4: Network communication
Talks about asymmetric public-key encryption and why you should use SSL, and demonstrates a man-in-the-middle attack. It also explains why your application should validate SSL certificates.

Chapter 5: Databases
Talks about general database best practices such as encryption and preventing SQL injection.

Chapter 6: Web Server Attacks
Talks about securing web services, XSS attacks, etc. Here, I feel the author should have covered the authentication and authorization challenges that one usually faces with Android applications, as one generally needs to validate requests coming from mobile devices. For example, a user can easily discover your service endpoint, because the code is deployed on the client side, and can send requests to that URL from their own application. You therefore need to differentiate between requests from your application and requests from other applications. (I personally use the Google+ sign-in APIs, along with server-side token validation, to ensure that back-end requests originate from within my application and from the correct individual.)

Chapter 7: Third party library integration
Mentions that you should be aware of the permissions that you are granting to the third party libraries.

Chapter 8: Device Security
Talks about device security issues and why you should enable encryption. It then talks about how device security is enforced on KitKat. The author then discusses some Android version-specific exploits and offers certain solutions.

Chapter 9: The Future
This chapter covers intent hijacking and how to deal with it in your Android application. It then covers devices such as Android Wear, the extended ecosystem of Android devices and its impact on security considerations. Finally, it covers tools that expose security vulnerabilities in your application.

Conclusions:
The book covers a lot of the common security vulnerabilities that developers introduce while writing Android applications. The prose is lucid, the vulnerabilities are demonstrated with practical examples, and solutions are offered for them. For a developer who is beginning application development with Android, knowing about these issues is important. However, most of these issues would already be known to experienced developers. I feel that detailed coverage of topics such as securing back-end services unobtrusively, OAuth, OWSM, etc. could have added value to the book. Maybe it's just me, but I expect these topics to be covered in detail, as most Android applications use some form of back-end service to offload heavy processing. I rate it 3.5 for the content it has covered.

Posted on Wednesday, March 04, 2015 by raman nanda

Feb 26, 2015

In this post, I will explain how you can use the Apache Chemistry APIs to query enterprise content management systems. The Apache Chemistry project provides client libraries that make it easy to integrate with any content management product that implements the CMIS standard. As you might be aware, there are multiple standards, such as JCR and CMIS, for interacting with content repositories. Although SQL-92 support is now available with JCR 2, I really prefer the conciseness and wider adoption of the CMIS standard, and the fact that the Apache Chemistry APIs make it really easy to interact with the content repository.

I am sharing an example that you can readily test without any special environment setup. Alfresco is one of the vendors that provides an ECM product supporting the CMIS standard, and it offers a public repository for you to play with. I will cover this example piece by piece.

  1. Connecting to the repository
    SessionFactory sessionFactory = SessionFactoryImpl.newInstance();
    Map<String, String> parameter = new HashMap<String, String>();
    //credentials for the public repository
    parameter.put(SessionParameter.USER, "admin");
    parameter.put(SessionParameter.PASSWORD, "admin");
    //the binding type to use
    parameter.put(SessionParameter.BINDING_TYPE, BindingType.ATOMPUB.value());
    //the endpoint
    parameter.put(SessionParameter.ATOMPUB_URL, "http://cmis.alfresco.com/s/cmis");
    //fetch the list of repositories
    List<Repository> repositories = sessionFactory.getRepositories(parameter);
    //establish a session with the first repository
    Session session = repositories.get(0).createSession();


    To connect to the repository, you need a few basic parameters: the username, the password, the endpoint URL (in this case the REST AtomPub service) and a binding type that specifies the type of the endpoint (WEBSERVICES, ATOMPUB, BROWSER, LOCAL and CUSTOM are the valid binding types). Once we have this information, we still need a repository id to connect to. In this case, I am using the first repository from the list of repositories to establish the session. Now, let's create a query statement to search for the documents.






  2. Querying the repository: 
    //query builder for convenience
    QueryStatement qs = session.createQueryStatement(
            "SELECT D.*, O.* FROM cmis:document AS D JOIN cm:ownable AS O ON D.cmis:objectId = O.cmis:objectId " +
            " where " +
            " D.cmis:name in (?)" +
            " and " +
            " D.cmis:creationDate > TIMESTAMP ? " +
            " order by cmis:creationDate desc");
    //array for the in argument
    String[] documentNames = new String[]{"Project Objectives.ppt", "Project Overview.ppt"};
    qs.setString(1, documentNames);
    Calendar now = Calendar.getInstance();
    //subtract 5 years to view the documents created in the last 5 years
    now.add(Calendar.YEAR, -5);
    qs.setDateTime(2, now);
    //get the first 50 records only
    ItemIterable<QueryResult> results = session.query(qs.toQueryString(), false).getPage(50);


    Here I have used the createQueryStatement method to build the query for convenience. You could also specify the query string directly, but that is not recommended, as you would then have to quote and escape the values yourself. The query is essentially a join between objects. The sample shows how to pass an array for the IN clause (the setString call) and a date (the setDateTime call) as parameters. The final statement runs the query and assigns the results to an ItemIterable, where each QueryResult is a record containing the selected columns.
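    To make the placeholder substitution concrete, here is a rough, hand-written sketch of the kind of query string the two parameters expand into. It is not the OpenCMIS implementation; the class CmisQuerySketch and its methods are hypothetical, and the query is simplified to a single type.

```java
import java.time.Instant;
import java.time.format.DateTimeFormatter;
import java.util.StringJoiner;

final class CmisQuerySketch {

    // Wraps a value in single quotes, escaping backslashes and quotes with a backslash.
    static String quote(String value) {
        return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'";
    }

    // Builds the comma-separated list that goes inside IN (...).
    static String inList(String... values) {
        StringJoiner joiner = new StringJoiner(", ");
        for (String value : values) {
            joiner.add(quote(value));
        }
        return joiner.toString();
    }

    // Formats the instant as a quoted ISO 8601 literal for use after the TIMESTAMP keyword.
    static String timestamp(Instant instant) {
        return "'" + DateTimeFormatter.ISO_INSTANT.format(instant) + "'";
    }

    static String build(Instant createdAfter, String... names) {
        return "SELECT D.* FROM cmis:document AS D"
                + " WHERE D.cmis:name IN (" + inList(names) + ")"
                + " AND D.cmis:creationDate > TIMESTAMP " + timestamp(createdAfter);
    }
}
```

    Quoting and escaping by hand like this is exactly the error-prone part that QueryStatement takes care of, which is why the builder is the safer choice.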




  3. Iterating the results:

    for (QueryResult record : results) {
        Object documentName = record.getPropertyByQueryName("D.cmis:name").getFirstValue();
        logger.info("D.cmis:name " + ": " + documentName);

        Object documentReference = record.getPropertyByQueryName("D.cmis:objectId").getFirstValue();
        logger.info("--------------------------------------");
        logger.info("Content URL: http://cmis.alfresco.com/service/cmis/content?conn=default&id=" + documentReference);
    }

    As explained above, we get an Iterable result set to iterate over the individual records. To fetch the first value of a property (as attributes might be multi-valued), I am using the getFirstValue method of the PropertyData interface. Note the last log statement: it prints the actual URL of the resource, which is just a base URL to which the object id of the matched document is appended.


  4. Closing the connection: As per the Chemistry Javadoc, there is no need to close a session, as it is purely a client-side concept. This makes sense, as we are not holding a connection here.



Viewing the results: To view the actual documents, open the URLs printed by the log statements in a browser.

Building the code: Add the following dependency to your Maven build for the sample.



    <dependency>
        <groupId>org.apache.chemistry.opencmis</groupId>
        <artifactId>chemistry-opencmis-client-impl</artifactId>
        <version>0.12.0</version>
    </dependency>


Wrapping up: I have covered just one example of using the CMIS query API with Apache Chemistry to query for documents. Kindly refer to the documentation links in the references section for other usages.






References:


CMIS_Query_Language

Java Examples for Apache Chemistry

Posted on Thursday, February 26, 2015 by raman nanda

Feb 20, 2015

Retrofit uses Google's Gson library to deserialize JSON into Java objects. Although this works in most cases, sometimes you have to override the deserialization process, either to parse only a part of the response or because the JSON data has no clear object representation.

In this post, I will share an example of a custom deserializer that parses the response from Wiktionary's word definition API. First, let us take a look at the request and the response.

The request URL is:

http://en.wiktionary.org/w/api.php?format=json&action=query&titles=sublime&prop=extracts&redirects&continue

The response is shown below; it has been shortened for brevity.

{"batchcomplete":"","query":{"pages":{"200363":{"pageid":200363,"ns":0,"title":"sublime","extract":"<p></p>\n<h2><span id=\"English\">English</span></h2>\n<h3><span id=\"Pronunciation\">Pronunciation</span></h3>\n<ul><li>\n</li>\n<li>Rhymes: <span lang=\"\">-a\u026am</span></li>\n</ul><h3><span id=\"Etymology_1\">Etymology 1</span></h3>\n<p>From <span>Middle English</span> <i class=\"Latn mention\" .......for brevity }}}}  


As you can see, the data we are interested in is the extract and probably the pageid. There is no straightforward object representation of this entire response in Java, because the pages object is keyed by a page id that differs for every result. We therefore implement our own custom deserializer to parse this JSON response.

The code for the deserializer is shown below.



public class DictionaryResponseDeserializer implements JsonDeserializer<WicktionarySearchResponse> {

    //plain Gson instance for the nested page objects, created once instead of once per page
    private final Gson gson = new Gson();

    @Override
    public WicktionarySearchResponse deserialize(JsonElement json, Type typeOfT, JsonDeserializationContext context) throws JsonParseException {
        WicktionarySearchResponse response = new WicktionarySearchResponse();
        //guard against a response that has no "query" object
        JsonElement queryElement = json.getAsJsonObject().get("query");
        if (queryElement == null || !queryElement.isJsonObject()) {
            return response;
        }
        //the "pages" object holds one entry per page, keyed by the page id
        JsonElement value = queryElement.getAsJsonObject().get("pages");
        if (value != null && value.isJsonObject()) {
            Query query = new Query();
            ArrayList<ResultPage> resultPages = new ArrayList<ResultPage>();
            for (Map.Entry<String, JsonElement> entry : value.getAsJsonObject().entrySet()) {
                //only the value is needed; the key repeats the page id
                resultPages.add(gson.fromJson(entry.getValue(), ResultPage.class));
            }
            query.setPages(resultPages);
            response.setQuery(query);
        }
        return response;
    }
}


Pay special attention to two statements. The first is the one that reads the pages object: it assigns to the JsonElement the object that contains all the pages from the JSON response, as that is the only data we are interested in. The second is the loop over its entries. We are interested in the values and not the keys (the pageid used as the key is also present inside each page object), so we just use entry.getValue() and transform it into a Java POJO instance using the Gson object.
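The core idea, turning an object keyed by dynamic ids into a plain list, can be sketched without Gson using ordinary maps. This is only an illustration of the technique; the class PagesFlattenSketch and its Page type are hypothetical and stand in for the JSON tree and ResultPage.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class PagesFlattenSketch {

    // Simplified stand-in for ResultPage.
    static final class Page {
        final long pageId;
        final String extract;

        Page(long pageId, String extract) {
            this.pageId = pageId;
            this.extract = extract;
        }
    }

    // Turns {"200363": {pageid, extract}, ...} into a list, ignoring the dynamic keys.
    static List<Page> flatten(Map<String, Map<String, Object>> pages) {
        List<Page> result = new ArrayList<>();
        for (Map.Entry<String, Map<String, Object>> entry : pages.entrySet()) {
            Map<String, Object> page = entry.getValue();
            long pageId = ((Number) page.get("pageid")).longValue();
            result.add(new Page(pageId, (String) page.get("extract")));
        }
        return result;
    }
}
```

Because each page carries its own pageid, nothing is lost by discarding the map keys.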



Below are the service interface and a utility class that invoke the word search API.



public interface DictionaryService {

    //asynchronous variant: the result is delivered to the callback
    @GET("/w/api.php")
    public void getMeaningOfWord(@QueryMap Map<String, String> map, Callback<WicktionarySearchResponse> response);

    //synchronous variant: do not call this on the main thread
    @GET("/w/api.php")
    public WicktionarySearchResponse getMeaningOfWord(@QueryMap Map<String, String> map);
}


/**
* Created by Ramandeep on 07-01-2015.
*/
public class DictionaryUtil {
    private static final String tag = "DictionaryUtil";
    //Gson instance with the custom deserializer registered for the response type
    private static final Gson gson = new GsonBuilder()
            .registerTypeAdapter(WicktionarySearchResponse.class, new DictionaryResponseDeserializer())
            .create();

    public static WicktionarySearchResponse searchDefinition(String word) {
        WicktionarySearchResponse searchResponse = null;
        //same host as the request URL shown earlier
        RestAdapter restAdapter = new RestAdapter.Builder()
                .setEndpoint("http://en.wiktionary.org").setConverter(new GsonConverter(gson))
                .build();
        DictionaryService serviceImpl = restAdapter.create(DictionaryService.class);
        Map<String, String> queryMap = new HashMap<String, String>();
        queryMap.put("action", "query");
        queryMap.put("prop", "extracts");
        queryMap.put("redirects", null);
        queryMap.put("format", "json");
        queryMap.put("continue", null);
        queryMap.put("titles", word);
        try {
            searchResponse = serviceImpl.getMeaningOfWord(queryMap);
        } catch (Exception e) {
            //the original null check never logged anything; log the failure and return null
            Log.e(tag, "Word lookup failed", e);
        }
        return searchResponse;
    }
}


 



 



The POJO classes are listed below, in order of hierarchy.



public class WicktionarySearchResponse {

    private Query query = null;

    public Query getQuery() {
        return query;
    }

    public void setQuery(Query query) {
        this.query = query;
    }
}


public class Query {

    private List<ResultPage> pages = null;

    public List<ResultPage> getPages() {
        return pages;
    }

    public void setPages(List<ResultPage> pages) {
        this.pages = pages;
    }
}


public class ResultPage {
    //the JSON key is "pageid", so map it explicitly
    //(requires com.google.gson.annotations.SerializedName)
    @SerializedName("pageid")
    private long pageId;
    private String title;
    private int index;
    private String extract;

    public ResultPage() {
    }

    public long getPageId() {
        return pageId;
    }

    public void setPageId(long pageId) {
        this.pageId = pageId;
    }

    public String getTitle() {
        return title;
    }

    public void setTitle(String title) {
        this.title = title;
    }

    public int getIndex() {
        return index;
    }

    public void setIndex(int index) {
        this.index = index;
    }

    public String getExtract() {
        return extract;
    }

    public void setExtract(String extract) {
        this.extract = extract;
    }
}

Posted on Friday, February 20, 2015 by raman nanda

Jan 15, 2015

In the past few weeks, I have been working on an Android application for reading feed articles. There were quite a few takeaways from that experience, and I am sharing a few of them here, along with a link to the application.

The app can be downloaded from the link: Simply Read on Play Store

Takeaways:

  • Parsing is slow: As I had to design a full-text content scraper and parser to let the user view the full content of articles, I soon realized that multiple iterations of parsing and scraping can be painfully slow. The same work that executes in under a second on a local machine would take 30-40 seconds on an Android device, which is unbearable for an end user. Considering that my device was pretty fast, I attribute this to performance snags in Google's implementation of the standard Java API. Solution: Move the parsing code to the server and create a REST service that returns the parsed article, but still give the user the option to use the parser within the app.
  • Authenticating user requests: This problem is linked to the first one. When I moved the parsing code to the server side, I needed to ensure that only authenticated users of the application could send parsing requests, as I could not allow everyone to query the back end and retrieve parsed articles. Solution: Use Google+ sign-in with server-side validation of the client OAuth tokens. This ensures that the requests originate from my application on an Android device and that the user is authenticated before making a parse request.
  • Not using content providers, so managing data refresh: I chose not to use content providers and instead went with the native APIs for fetching data from the SQLite database and doing the updates. The obvious problem that arises is managing data refresh across different activities. Solution: I used Otto coupled with loaders to manage data refresh across activities. Using Otto ensured loose coupling of components.
  • ProGuard and Retrofit don't gel well together: There are quite a few standard exclusion rules that you have to write to get Retrofit to work with ProGuard. Also make sure to exclude the classes and attributes that you are going to use with GSON to convert JSON to an object representation. Here's a snippet of the rules.
    -keepattributes Signature
    -keepattributes *Annotation*
    -keep interface com.squareup.** { *; }
    -dontwarn rx.**
    -dontwarn retrofit.**
    -keep class com.squareup.** { *; }
    -keep class retrofit.** { *; }

    -keepclasseswithmembers class * {
        @retrofit.http.* <methods>;
    }

    # now exclude the response classes and POJOs
    -keep class com.blogspot.ramannanda.apps.simplyread.model.rest.SampleResponse { *; }




 



Although there are a lot of other takeaways, I am going to keep this brief. I look forward to hearing your feedback about the app.



Posted on Thursday, January 15, 2015 by raman nanda

Dec 12, 2014

ListView is probably the most common Android component, but it has to be implemented correctly to provide a good user experience.

In this post, I will give a few suggestions on how you can achieve near-optimum list performance. They are mentioned below:

  1. Access data incrementally: Load only the data you need. I know this is a general principle for reducing the memory footprint of your application, but even so, it can and must be done. The ListView class can report scroll updates; implement the OnScrollListener interface to receive them. When the user scrolls to the bottom of the list, load the next increment of data. This can be done independently of the data source, whether it is a REST web service or the SQLite database. In this post, I am going to show how to handle this when your view is backed by data from the database. Take a look at the abstract class EndlessScrollListener, which implements OnScrollListener. To integrate it with your data source and ListView, all you have to do is extend this class as shown below.
    // The listener implementation
    listener = new EndlessScrollListener() {
        @Override
        public void onLoadMore(int page, int totalItemsCount) {
            // Triggered only when new data needs to be appended to the list
            // Add whatever code is needed to append new items to your AdapterView
            customLoadMoreDataFromApi(page);
            // or customLoadMoreDataFromApi(totalItemsCount);
        }

        private void customLoadMoreDataFromApi(int page) {
            //Progress dialog to show with loading text
            pd.setTitle(R.string.loadingText);
            pd.show();
            Bundle arg = new Bundle();
            //here pageSize is a constant
            //How many records to fetch
            limit = pageSize;
            //From which offset
            offset = (page - 1) * pageSize;
            //add these to the parameters that will be passed to the loader
            arg.putInt("limit", limit);
            arg.putInt("offset", offset);
            //add custom arguments

            //Instantiate or restart the loader
            Loader loader = getActivity().getSupportLoaderManager().getLoader(LOADER_ID);
            if (loader != null && !loader.isReset()) {
                getActivity().getSupportLoaderManager().restartLoader(LOADER_ID, arg, _self);
            } else {
                getActivity().getSupportLoaderManager().initLoader(LOADER_ID, arg, _self);
            }
        }
    };
    //set on scroll listener for the listview
    lstView.setOnScrollListener(listener);

    SQLite provides LIMIT and OFFSET clauses, so when the user scrolls downwards, we restart the loader with the new offset parameter. The methods involved in the loader callback implementation are shown below.

    @Override
    public Loader<Object> onCreateLoader(int id, Bundle args) {
        Loader loader = new SimpleAbstractListLoader(getActivity().getApplicationContext(), id, args);
        return loader;
    }

    @Override
    public void onLoadFinished(Loader<Object> arg0, Object arg1) {
        //here article list is an array list that contains the data
        //on the initial load create a new list, otherwise append the data to the existing list
        if (articleList == null || adapter == null) {
            //FeedArticleSummary is a POJO class
            articleList = new ArrayList<FeedArticleSummary>();
            articleList.addAll((ArrayList<FeedArticleSummary>) arg1);
            //Custom Adapter for the listview that extends the ArrayAdapter
            adapter = new ArticleListAdapter(this.getActivity(),
                    R.layout.article_list_layout, articleList);
            lstView.setAdapter(adapter);
        } else {
            articleList.addAll((ArrayList<FeedArticleSummary>) arg1);
            adapter.notifyDataSetChanged();
        }
        //load finished so dismiss the progress dialog.
        pd.dismiss();
    }

    Now the loader class SimpleAbstractListLoader just needs to pass the limit and offset parameters to a database utility class, which applies them to the SQL query that fetches the data, i.e. rawQuery + whereClause + orderBy + limitAndOffsetClause.


    Note that the ORDER BY clause is not optional here: the data needs to be fetched in a predefined order so that the OFFSET clause fetches the correct rows.
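    As a minimal sketch of that last step, the helper below derives the offset from the page number and appends the ORDER BY, LIMIT and OFFSET clauses to a base query. The class PagingClauseSketch and the table and column names used with it are hypothetical; real code should still bind user-supplied values as query arguments.

```java
final class PagingClauseSketch {

    // Pages are 1-based, matching the page argument passed to onLoadMore.
    static int offsetFor(int page, int pageSize) {
        if (page < 1 || pageSize < 1) {
            throw new IllegalArgumentException("page and pageSize must be positive");
        }
        return (page - 1) * pageSize;
    }

    // Appends a fixed ordering plus the LIMIT/OFFSET window to the base query.
    static String pagedQuery(String baseQuery, String orderBy, int page, int pageSize) {
        return baseQuery
                + " ORDER BY " + orderBy
                + " LIMIT " + pageSize
                + " OFFSET " + offsetFor(page, pageSize);
    }

    public static void main(String[] args) {
        System.out.println(pagedQuery("SELECT * FROM articles", "published_date DESC", 3, 20));
    }
}
```

    With a page size of 20, page 3 skips the first 40 rows, so a stable ordering is what keeps consecutive pages from overlapping or skipping rows.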




  2. Use a ViewHolder and know that views can be recycled: The ViewHolder pattern avoids repeated calls to findViewById(), which is used to look up the views in a layout. A ViewHolder is just a static class whose instance holds the component views and is stored in the tag field of the row layout. Secondly, list views are recycled, so it is a best practice to reuse the views and populate them with new data instead of inflating them again. Combined, these two techniques make scrolling the list very smooth and fast. The latter is even more important than the former, as inflating a view is a costly operation and should definitely be avoided. So, if you are using ArrayAdapter as the data source for your list, this is how your getView() method should look.

    @Override
    public View getView(int position, View convertView, ViewGroup parent) {
        ViewHolder holder;
        // Get the data item for this position
        //custom pojo which holds the reference to the data
        FeedArticleSummary articleSummary = getItem(position);
        if (articleSummary == null) {
            return null;
        }
        // Check if an existing view is being reused, otherwise inflate the view
        if (convertView == null) {
            //pass the parent (without attaching) so the row's layout params are honoured
            convertView = mInflater.inflate(R.layout.article_list_layout, parent, false);
            holder = new ViewHolder();
            holder.text = (TextView) convertView.findViewById(R.id.textViewArticleListFragment);
            holder.imageButton = (ImageButton) convertView.findViewById(R.id.favoriteImageArticleList);
            holder.checkBox = (CheckBox) convertView.findViewById(R.id.selected);
            //Here the holder is set as a tag field in the layout
            convertView.setTag(holder);
        } else {
            holder = (ViewHolder) convertView.getTag();
        }
        //set the data here
        //for example
        // holder.checkBox.setChecked(true);
        ....
    }


    And here is the ViewHolder class.

    static class ViewHolder {
        TextView text;
        CheckBox checkBox;
        ImageButton imageButton;
    }



  3. Do not set the layout_height and layout_width properties to wrap_content: Definitely avoid this, because getView() will then be invoked a number of times just to determine the height and width of the views that are to be drawn. Instead use match_parent (named fill_parent on older API levels) or a fixed height and width, so that there are no unnecessary invocations of the getView() method.


  4. If you know which row has been modified, call getView() instead of notifyDataSetChanged(): Often a user modifies a single item in a list and you need to make a UI change for just that item. For example, if a user selects a list item to view its details, you might change the font style of that particular item to show that it has been read. In this case it is much faster to invoke getView() for that item than to call notifyDataSetChanged().

    @Override
    public void onListFragmentItemClick(ListView l, View view, int position,
            long id) {

        ListView lstView = l;
        //adapter reference
        ArticleListAdapter articleAdapter = (ArticleListAdapter) l
                .getAdapter();
        //accessor for the underlying list
        ArrayList<FeedArticleSummary> articleSummaries = articleAdapter.getArticleSummaries();
        FeedArticleSummary articleSummary = articleAdapter.getItem(position);
        if (articleSummary.getArticleRead() == null || articleSummary.getArticleRead().equals("N")) {
            //update the underlying data source for the list item
            articleSummary.setArticleRead("Y");
            int visiblePosition = lstView.getFirstVisiblePosition();
            View selectedChildView = lstView.getChildAt(position - visiblePosition);
            //update the data in the DB using a loader
            SQLiteCursorLoader loader = new SQLiteCursorLoader(this, fdb.getDbHelper(), null,
                    null);
            ContentValues cValues = new ContentValues();
            cValues.put(FeedSQLLiteHelper.COLUMN_ARTICLE_READ, "Y");
            loader.update(FeedSQLLiteHelper.TABLE_ARTICLES, cValues, "_id = ? ", new String[]{articleSummary.getArticleId()});
            //call getView so that only this row is rebound
            lstView.getAdapter().getView(position, selectedChildView, lstView);
            ...
    }

    And in your getView() method, handle the font style change:

    ...
    if (articleSummary.getArticleRead() == null || articleSummary.getArticleRead().equals("N")) {
        holder.text.setTypeface(holder.text.getTypeface(), Typeface.BOLD);
    } else {
        holder.text.setTypeface(holder.text.getTypeface(), Typeface.ITALIC);
    }
    ...
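    One detail worth guarding is the translation from adapter position to child index, because getChildAt() returns null when the row is not currently on screen. The sketch below isolates that arithmetic in plain Java; the class VisibleRowSketch is hypothetical, and it assumes a list without header views.

```java
final class VisibleRowSketch {

    // Returns the child index for an adapter position, or -1 when that row is off screen.
    static int childIndexFor(int position, int firstVisiblePosition, int childCount) {
        int index = position - firstVisiblePosition;
        return (index >= 0 && index < childCount) ? index : -1;
    }

    public static void main(String[] args) {
        // Rows 10..17 are visible, so adapter position 12 is the third child.
        System.out.println(childIndexFor(12, 10, 8));
    }
}
```

    When the result is -1 there is nothing to redraw, since the row will be rebound with the updated data the next time it scrolls into view.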




Note: Android has introduced RecyclerView, which forces you to use the ViewHolder pattern as a best practice; this was not enforced earlier in ListView.

I might have missed a few other optimizations, so kindly do provide your feedback.

Posted on Friday, December 12, 2014 by raman nanda

Oct 30, 2014

If you want to integrate Lucene with your Android application, this post will get you started. Lucene provides a wide range of searching options, such as fuzzy search, wildcard search, etc., so you can use it in your Android application if you want to provide search over your custom data model.

In the code shown below, searches are near real time because I pass the IndexWriter instance to the SearcherManager, so the IndexReader is created from that IndexWriter. Also, as the creation of IndexWriter and SearcherManager is expensive, the best place to initialize them is the application class.

Initialization: The application class, which initializes the IndexWriter and SearcherManager.

public class FeedReaderApplication extends Application {

    //log tag, which the original listing used without declaring
    private static final String tag = FeedReaderApplication.class.getName();
    private static SearcherManager searcherManager = null;
    private static IndexWriter indexWriter = null;

    public static final SearcherManager getSearcherManager() { return searcherManager; }
    public static final IndexWriter getIndexWriter() { return indexWriter; }

    @Override
    public void onCreate() {
        super.onCreate();
        //pick the properties from user preferences
        SharedPreferences preferences = PreferenceManager.getDefaultSharedPreferences(getApplicationContext());
        Analyzer analyzer = new SimpleAnalyzer(Version.LUCENE_41);

        IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_41, analyzer);
        //pick the buffer size from property
        String memorySize = preferences.getString("lucene_memory_size", "5.0");
        config.setRAMBufferSizeMB(Double.valueOf(memorySize));
        config.setOpenMode(IndexWriterConfig.OpenMode.CREATE_OR_APPEND);
        //create index on external directory under lucene folder
        File path = new File(getApplicationContext().getExternalFilesDir(null), "lucene");
        try {
            Directory directory = FSDirectory.open(path);
            indexWriter = new IndexWriter(directory, config);
            boolean applyAllDeletes = true;
            //no need to warm the search
            searcherManager = new SearcherManager(indexWriter, applyAllDeletes, null);
        } catch (IOException e) {
            Log.e(tag, "Error occurred while opening indexWriter/SearcherManager" + e.getMessage(), e);
        }
    }

}


In the example application, I am using a SQLite database to store the feed data, but the titles are analyzed and stored in the Lucene index. Also, I am using SimpleAnalyzer rather than StandardAnalyzer, because StandardAnalyzer filters out stop words before storing the terms. That is not going to work for us, as the user might search with stop words and find no matches.
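To illustrate the difference, the sketch below mimics what SimpleAnalyzer does to a title: it splits the text at non-letter characters and lowercases each token, without dropping stop words. It is a plain-Java approximation for illustration only, not Lucene code, and the class SimpleTokenizerSketch is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class SimpleTokenizerSketch {

    // Splits at every non-letter character and lowercases the resulting tokens.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isLetter(c)) {
                current.append(Character.toLowerCase(c));
            } else if (current.length() > 0) {
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("The Art of War"));
    }
}
```

Words such as "the" and "of" stay in the token list, so a user who types them still gets a match. This is also why the search code lowercases each term before building the query.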



public class LuceneSearchUtil {
private static final String tag = LuceneSearchUtil.class.getName();

public LuceneSearchUtil() {
}
//insert articles id,title and feedid
public static void insertArticleDocument(ContentValues contentValues) {
try {
IndexWriter writer = FeedReaderApplication.getIndexWriter();
Document document = new Document();
//don't analyze id field, store as such
Field idField = new StringField(FeedSQLLiteHelper.COLUMN_ID, String.valueOf(contentValues.get(FeedSQLLiteHelper.COLUMN_ID)), Field.Store.YES);
document.add(idField);
//analyze the title field, so use a TextField
Field titleField = new TextField(FeedSQLLiteHelper.COLUMN_ARTICLE_TITLE, String.valueOf(contentValues.get(FeedSQLLiteHelper.COLUMN_ARTICLE_TITLE)), Field.Store.YES);
document.add(titleField);
Field feedId= new StringField(FeedSQLLiteHelper.COLUMN_ARTICLE_FEED_ID,String.valueOf(contentValues.get(FeedSQLLiteHelper.COLUMN_ARTICLE_FEED_ID)), Field.Store.YES);
document.add(feedId);
writer.addDocument(document);
} catch (IOException e) {
Log.e(tag, "Unable to add document as " + e.getMessage(), e);
}
}

//searching the articles searchterm is passed and broken down into individual terms
public static ArrayList<String> searchAndGetMatchingIds(String searchTerm) {
ArrayList<String> result=new ArrayList<String>();
//get the searchermanager
SearcherManager searcherManager = FeedReaderApplication.getSearcherManager();
if(searcherManager==null){
//the index could not be opened in FeedReaderApplication.onCreate()
return result;
}
//split on whitespace; trim first so a leading space does not produce an empty term
String[] terms= searchTerm.trim().split("[\\s]+");
//multiple terms are to be searched
SpanQuery[] spanQueryArticleTitle=new SpanQuery[terms.length];
int i=0;
for (String term:terms){
//wildcardquery
WildcardQuery wildcardQuery=new WildcardQuery(new Term(FeedSQLLiteHelper.COLUMN_ARTICLE_TITLE,term.toLowerCase()));
spanQueryArticleTitle[i]=new SpanMultiTermQueryWrapper<WildcardQuery>(wildcardQuery);
i=i+1;
}
//no words between the typed text; you could increase this but then performance will be lowered
SpanNearQuery spanNearQuery1=new SpanNearQuery(spanQueryArticleTitle,0,true);
IndexSearcher indexSearcher = null;
try {
//acquire inside the try block so the searcher is always released in finally
indexSearcher = searcherManager.acquire();
//execute topN query
TopDocs topDocs = indexSearcher.search(spanNearQuery1, ProjectConstants.LUCENE_TOP_N);
if(topDocs!=null){
for(ScoreDoc scoreDoc:topDocs.scoreDocs){
Document document= indexSearcher.doc(scoreDoc.doc);
String id= document.get(FeedSQLLiteHelper.COLUMN_ID);
result.add(id);
}
}
} catch (IOException e) {
Log.e(tag, "Unable to search as " + e.getMessage(), e);
}
finally {
//release only if a searcher was actually acquired
if(indexSearcher!=null){
try {
searcherManager.release(indexSearcher);
} catch (IOException e) {
Log.e(tag,"Exception while releasing Index Searcher "+e.getMessage(),e);
}
}
}

return result;
}

//sample delete method

public static void deleteArticlesByFeedId(String feedId){
IndexWriter indexWriter = FeedReaderApplication.getIndexWriter();
TermQuery query=new TermQuery(new Term(FeedSQLLiteHelper.COLUMN_ARTICLE_FEED_ID,feedId));
try {
indexWriter.deleteDocuments(query);
} catch (IOException e) {
Log.e(tag, "Unable to delete document as " + e.getMessage(), e);
}
try {
indexWriter.commit();
} catch (IOException e) {
Log.e(tag, "Unable to commit changes " + e.getMessage(), e);
}
}
}
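
The matching rule used by searchAndGetMatchingIds (terms in the order typed, with no word between them) can be modelled in plain Java. The sketch below is only an illustration of that rule under simplifying assumptions: the class AdjacentTermMatcher and its methods are hypothetical, the tokenizer merely approximates SimpleAnalyzer, and the wildcard handling covers only '*' and '?'. It does not reproduce how Lucene evaluates span queries internally.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.regex.Pattern;

// Plain-Java model of "in order, word distance 0" matching; hypothetical helper, NOT Lucene code.
final class AdjacentTermMatcher {

    // Lower-case the title and split it on non-letters, roughly like SimpleAnalyzer.
    static List<String> tokenize(String title) {
        List<String> tokens = new ArrayList<>();
        for (String token : title.toLowerCase(Locale.ROOT).split("[^\\p{L}]+")) {
            if (!token.isEmpty()) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    // Translate a wildcard term ('*' = any run of characters, '?' = one character) into a regex.
    static Pattern toPattern(String wildcardTerm) {
        StringBuilder regex = new StringBuilder();
        for (char c : wildcardTerm.toCharArray()) {
            if (c == '*') {
                regex.append(".*");
            } else if (c == '?') {
                regex.append('.');
            } else {
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(regex.toString());
    }

    // True if the query terms match consecutive title tokens, in the order typed.
    static boolean matches(String title, String query) {
        List<String> tokens = tokenize(title);
        String[] terms = query.trim().toLowerCase(Locale.ROOT).split("\\s+");
        Pattern[] patterns = new Pattern[terms.length];
        for (int i = 0; i < terms.length; i++) {
            patterns[i] = toPattern(terms[i]);
        }
        // Slide a window of terms.length tokens over the title.
        for (int start = 0; start + patterns.length <= tokens.size(); start++) {
            boolean allMatch = true;
            for (int i = 0; i < patterns.length && allMatch; i++) {
                allMatch = patterns[i].matcher(tokens.get(start + i)).matches();
            }
            if (allMatch) {
                return true;
            }
        }
        return false;
    }
}
```

In this model, matches("Sweet Orange Juice Recipes", "Sweet Orange") is true, while matches("Sweet and Sour Orange", "Sweet Orange") is false because other words sit between the two terms.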


Note the search method in LuceneSearchUtil above. It splits the user's query into individual terms and searches for them with a SpanNearQuery that uses a word distance (slop) of 0 and requires the terms in order, which means that whatever the user typed must match without a word gap. For example, if the user types "Sweet Orange", the two terms match only if there is no word between them in the article title. Also note that Lucene returns the top matching results in ranked order, so when you pass these IDs to your database to retrieve the actual data, you must make sure the returned data keeps that order. Here is the relevant snippet from the AsyncTaskLoader.



 @Override
public List<FeedArticleSummary> loadInBackground() {
//query the searchterm
ArrayList<String> ids= LuceneSearchUtil.searchAndGetMatchingIds(searchTerm);
ArrayList<FeedArticleSummary> results=new ArrayList<FeedArticleSummary>();
//returns all the articles that match
HashMap<String,FeedArticleSummary> map= fdb.getEntriesForFeedByIds(ids);
//order them
if(map!=null){
for(String id:ids){
if(map.get(id)!=null){
results.add(map.get(id));
}

}
}

return results;
}


Now, all you need to do is invoke the loader to query and load the data when the user uses the SearchView in your application.  Here are the implemented methods for the SearchView.OnQueryTextListener.



 @Override
public boolean onQueryTextSubmit(String queryText) {
//let's set a threshold: only search when the query is longer than 5 characters
if (queryText!=null&&queryText.trim().length() > 5) {
Loader loader = getActivity().getSupportLoaderManager().getLoader(ProjectConstants.LOADER_ARTICLES_SEARCH);
if (loader != null && !loader.isReset()) {
getActivity().getSupportLoaderManager().restartLoader(ProjectConstants.LOADER_ARTICLES_SEARCH, args, _self);
} else {
getActivity().getSupportLoaderManager().initLoader(ProjectConstants.LOADER_ARTICLES_SEARCH, args, _self);
}
if(!pd.isShowing()){
//show the progressdialog
pd.show();
}
}
return true;
}
@Override
public boolean onQueryTextChange(String s) {
//not handling this
return false;
}




Now, in the onLoadFinished method, just replace the data in your ArrayAdapter and you are set. That's it: you have now integrated Lucene into your application.




Posted on Thursday, October 30, 2014 by raman nanda