Monthly Archives: April 2016

WinMerge

WinMerge is a visual diff and merge application for Windows that can be a real lifesaver in many situations.

For example, if you work on a programming team and need to synchronize with other members who edit the same source files, you can spot the differences and merge your work with theirs in a snap. The interface is user-friendly and highly graphical, but after a while you will likely find that the shortcut keys speed up the workflow even more.

With the Alt+cursor keys you can navigate and merge the differences quickly. The software can also compare whole directory trees, and you can define filters to show only the files you are interested in. For example, a Java developer can exclude the .class files.
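To make the idea of a filtered directory comparison concrete, here is a minimal Python sketch. It is not how WinMerge works internally; the function names `list_files` and `compare_trees` and the default `*.class` filter are assumptions chosen for illustration. It walks two trees, skips files matching the exclusion patterns, and sorts the rest into four groups.

```python
import filecmp
import fnmatch
import os


def list_files(root, exclude_patterns):
    """Return the set of relative file paths under root, skipping excluded names."""
    found = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # Hypothetical filter: skip any file whose name matches a pattern.
            if any(fnmatch.fnmatch(name, pat) for pat in exclude_patterns):
                continue
            full = os.path.join(dirpath, name)
            found.add(os.path.relpath(full, root))
    return found


def compare_trees(left, right, exclude_patterns=("*.class",)):
    """Classify files as only-left, only-right, different, or identical."""
    left_files = list_files(left, exclude_patterns)
    right_files = list_files(right, exclude_patterns)
    common = left_files & right_files
    # shallow=False forces a byte-by-byte comparison of file contents.
    different = {
        rel for rel in common
        if not filecmp.cmp(os.path.join(left, rel),
                           os.path.join(right, rel),
                           shallow=False)
    }
    return {
        "only_left": sorted(left_files - right_files),
        "only_right": sorted(right_files - left_files),
        "different": sorted(different),
        "identical": sorted(common - different),
    }
```

Calling `compare_trees("projectA", "projectB")` on two checkouts (the directory names are placeholders) would return a dictionary with the four groups, with compiled .class files left out of every group.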

Finally, you can choose how “smart” the difference display is, for example by ignoring empty lines.

On the downside, WinMerge has some quirks that can be annoying. The directory difference window is a bit confusing, and the version I tried had some problems comparing directories over a Samba connection from a Windows PC to a Sun workstation (nothing serious, fortunately). All things considered, however, WinMerge is the only Windows application that comes close to the quality of “Meld” (Unix), and it can be recommended without hesitation.

MS Excel Tip

When setting up an MS Excel spreadsheet, try naming ranges or cells instead of manually typing cell references like A1:E10.

Let’s say you are setting up a spreadsheet to track your finances. If you are like me, this spreadsheet contains information such as passwords and account numbers that, for security reasons, you don’t generally want to print. So let’s name a range “quick_print.” Here’s how.

1. Start by selecting the range you want to name with your mouse.
2. From the Insert menu select Name and then Define.
3. Enter the name “quick_print” in the top box and hit OK.

Now, any time you want to select this range, you can use the drop-down box at the top that shows cell addresses (the Name Box) or, alternatively, select Edit, then Go To, and pick the range. You can also use this technique to give a single cell a meaningful name. For example, when you enter the balance of your bank account, you might name that cell “bank”. You can then use the name in MS Excel formulas, such as =bank+stocks (without quotation marks, and assuming another cell is named “stocks”), instead of formulas built from cell addresses.

Apache Lucene

Lucene is the tool used in advanced matching/filtering of services.

It is an open source project hosted by Apache that provides a Java-based, high-performance, full-featured text search engine library. To search large amounts of text quickly, one must first convert the text into a format that can be searched rapidly, eliminating the slow sequential scan of each file for the given word or phrase. This conversion process is called indexing, and its output is called an index. Searching is the process of looking up words in an index to find the documents in which they appear.
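The index-then-search idea can be illustrated with a toy inverted index in Python. This is only a sketch of the general technique, not Lucene's implementation or API; the names `build_index` and `search` and the sample documents are made up for the example.

```python
from collections import defaultdict


def build_index(documents):
    """Map each lowercase word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def search(index, word):
    """Look a single word up in the index instead of scanning every document."""
    return sorted(index.get(word.lower(), set()))


# Hypothetical sample documents, keyed by file name.
documents = {
    "a.txt": "Lucene is a search library",
    "b.txt": "WinMerge is a diff tool",
}
index = build_index(documents)
print(search(index, "lucene"))  # documents containing "lucene"
print(search(index, "is"))      # documents containing "is"
```

The cost of scanning the text is paid once, when the index is built. Each later search is a single dictionary lookup, however many documents were indexed.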

Lucene lets you add indexing and searching capabilities to your applications, and it can index and make searchable any data that can be converted to a textual format. This means Lucene can be used to index and search information kept in torrents, web pages on remote web servers, documents stored in local file systems, simple text files, Microsoft Word documents, HTML files, PDF documents, or any other format from which textual information can be extracted. It is used by many well-known websites, such as the online encyclopedia Wikipedia, as well as by many Java applications.

To build an index, Lucene uses different types of analyzers, such as StandardAnalyzer, WhitespaceAnalyzer, StopAnalyzer, and SnowballAnalyzer. An analyzer breaks text fields up into indexable tokens and is a core part of Lucene. For example, StandardAnalyzer is a sophisticated general-purpose analyzer, WhitespaceAnalyzer is a very simple analyzer that just separates tokens at white space, and StopAnalyzer removes common English words that are not usually useful for indexing.
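The difference between a whitespace-style and a stop-word-style analyzer can be imitated in a few lines of Python. These functions only mimic the behaviour described above and are not Lucene's classes; the names `whitespace_tokens` and `stop_tokens` and the short stop word list are assumptions for illustration.

```python
import re

# Illustrative stop word list; an assumption, not Lucene's actual default set.
STOP_WORDS = {"a", "an", "and", "is", "of", "the", "to"}


def whitespace_tokens(text):
    """Split on whitespace only, keeping case and punctuation untouched."""
    return text.split()


def stop_tokens(text, stop_words=STOP_WORDS):
    """Lowercase, keep runs of letters, and drop common stop words."""
    words = re.findall(r"[a-z]+", text.lower())
    return [w for w in words if w not in stop_words]


sentence = "The Quick fox, jumps over the dog"
print(whitespace_tokens(sentence))
print(stop_tokens(sentence))
```

The first function keeps "The" and "fox," exactly as written, while the second normalizes case, strips punctuation, and discards "the". This is why the choice of analyzer affects which queries can match a document.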