Monday, 20 June 2016

Pinot v1.09, GitHub move

Time flies when you're having fun! A little bit more than one year ago, v1.09 came out. And at long last, I have moved the repository away from the now defunct Google Code to GitHub.
The source now lives at

Saturday, 19 July 2014

Pinot v1.08

Version 1.08 is out and fixes issues with filters when building with LLVM. It also adds support for libnotify and brings a small number of internal changes and cleanups.
Many thanks to Thierry Thomas for his help with getting this version out.

Sunday, 25 May 2014

Pinot v1.07

Almost one year after 1.06, I am releasing 1.07.
It fixes known compilation issues and brings some internal changes.

Saturday, 25 May 2013

Pinot v1.06

I am releasing a minor version. It adds support for boost 1.50's Spirit, based on a patch by Thierry Thomas, better handling of potential errors while stepping through SQL records, and minor fixes to the curl backend.
Get the source from the downloads section.

Saturday, 2 March 2013

Pinot 1.05

1.05 fixes some issues with abstract generation and how CJKV text was tokenized.
Get the source tarball here.

Sunday, 10 February 2013

Pinot 1.04

Version 1.04 fixes the handling of accents and other diacritics.
Get the source from the downloads page.
Happy year of the Snake to all!

Monday, 14 January 2013

Pinot 1.03

Version 1.03 is out. It brings in a fix to Unicode handling, broken since 1.01. Upgrading is strongly recommended.
Get the source from the downloads page. The NEWS file is here.

Sunday, 4 November 2012

Pinot 1.02

I have just uploaded version 1.02. Get the source here. It's a minor new version. Main changes are updated Japanese and Brazilian Portuguese translations, as well as a brand new Czech translation. Thanks go to Takafumi Arakaki, Adriano Steffler and Zbyněk Schwarz.

Monday, 27 August 2012

Pinot 1.01

Version 1.01 is out.
It can convert RST files with rst2html, assuming that RST files are correctly detected. See this report for an explanation.
The mbox filter should do a much better job at extracting message parts.
pinot-index can override MIME type detection based on file extensions, with "--override MIMETYPE:EXTENSION".
Finally, unac was dropped in favour of our own code, resulting in faster indexing.
As usual, see the NEWS file for details, and get the source from the Downloads area.
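For the curious, here is a minimal sketch of how a "MIMETYPE:EXTENSION" argument could be split and applied. This is not Pinot's actual code: the function names parseOverride and resolveMimeType are hypothetical, and "text/x-rst" is only an example value.

```cpp
#include <map>
#include <string>

// Hypothetical helper: splits "MIMETYPE:EXTENSION" into its two halves.
// Returns false, leaving the outputs untouched, if either half is missing.
bool parseOverride(const std::string &spec,
                   std::string &mimeType, std::string &extension)
{
    // MIME types contain '/' but no ':', so the first colon is the separator.
    const std::string::size_type colon = spec.find(':');
    if (colon == std::string::npos || colon == 0 || colon + 1 >= spec.size())
    {
        return false;
    }

    std::string ext = spec.substr(colon + 1);
    // Accept both "rst" and ".rst".
    if (ext[0] == '.')
    {
        ext.erase(0, 1);
    }
    if (ext.empty())
    {
        return false;
    }

    mimeType = spec.substr(0, colon);
    extension = ext;
    return true;
}

// Hypothetical helper: returns the overriding type when the file's extension
// has one, and the detected type otherwise.
std::string resolveMimeType(const std::string &fileName,
                            const std::map<std::string, std::string> &overrides,
                            const std::string &detectedType)
{
    const std::string::size_type dot = fileName.rfind('.');
    const std::string::size_type slash = fileName.rfind('/');
    // No extension, or the last dot belongs to a directory name.
    if (dot == std::string::npos ||
        (slash != std::string::npos && dot < slash))
    {
        return detectedType;
    }

    const std::map<std::string, std::string>::const_iterator it =
        overrides.find(fileName.substr(dot + 1));
    return (it != overrides.end()) ? it->second : detectedType;
}
```

With an override parsed from "text/x-rst:rst", a file named readme.rst would be treated as text/x-rst whatever the detection says, while files with other extensions keep their detected type.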

Saturday, 16 June 2012

Pinot 1.0

Pinot finally reaches version 1.0.
I have been wanting to get a new release out for some time, and was initially going to number it 0.99 but then realized I should have made the jump a long time ago.
I have not been good at following a regular release schedule. I will try and correct that. There are many things I want to fix, change or add.

There's an initial GTK+ 3 port of the GUI codebase, support for LibreOffice's libexttextcat v3.2 and better mbox parsing and extraction.
The NEWS file has the details. Source is here.

Sunday, 6 November 2011

Pinot 0.98

I have just released v0.98, which is going to be the last release hosted on BerliOS.
As of today, Pinot is moving to Google Code at Please update your bookmarks.

It features two new filters, one for CHM files and another for images that hold Exif, IPTC or XMP metadata. It also supports libexttextcat v3.1, the LibreOffice fork of libtextcat.
Special thanks to Martijn Verstrate and others who helped with translations.

Details can be found in the NEWS file. The source is here.

Saturday, 1 October 2011

Pinot 0.98 is coming

I am planning to release Pinot 0.98 this month. It will have the following features:
- a new filter for CHM files that relies on chmlib.
- a new exiv2-based filter to replace the current libexif filter
- support for libexttextcat, the libtextcat 3.0 fork from LibreOffice.

I found out recently that BerliOS will be closing down at the end of the year; 0.98 will therefore be the last release before I move the whole project somewhere else, probably Google Code.

Sunday, 9 January 2011

Pinot 0.97

After a long break, I am releasing a new version that brings improvements to crawling, memory management and the indexing of acronyms. Several bugs were fixed and translations were massively updated.
I would like to thank all translators involved, and especially Nikolay Kachanov for his feedback. Спасибо большое !

As usual, the NEWS file has the full details and the source can be downloaded here.

Finally, it's come to my attention that Pinot is featured in the Desktop Search section of Datamation's Ultimate List of Open Source Software ! Another good reason to celebrate this New Year ;-)

Thursday, 2 September 2010

Pinot on Gentoo

Josh Saddler blogged about his experience with running Pinot on Gentoo here.
Gentoo users may want to use his up-to-date ebuilds for Pinot and libtextcat.

Monday, 12 July 2010

Pinot 0.96

This version has two major improvements : it's able to index mail messages with content held in a file outside the mbox, and to get the system's battery status via DeviceKit-power or upower.
Several bugs were fixed (including Debian bug #556062). I would encourage anyone using an 0.9x release to upgrade to this.

Note that if you have Xapian 1.2.0 or newer installed, Pinot will use its new smaller, faster Chert back-end.

Special thanks go to Jens Wilhelm Wulf for his feedback and patience ;-)

For the full details, see the NEWS file. Get the source from the download page.

Tuesday, 4 May 2010

Not dead

I haven't done any serious work on Pinot for months, not because I don't want to, but because I really haven't found the time to.
I am targeting a new release at the end of June.

Sunday, 6 December 2009

Linux Desktop Search Engines Compared has an article about the various desktop search engines available on Linux. The author is looking for the one most suitable to his needs. Even though he ends up recommending Beagle, Pinot comes out relatively well. Perceived drawbacks are RAM and CPU usage, which I ought to revisit for the next release.

By the way, the comparison table on Wikinfo mentioned in the article can be found here.

Friday, 13 November 2009

Pinot 0.95

At long last, a new release !
0.95 merges in Antoine Jacoutot's patches for the OpenBSD port, fixes the "path:" query filter and the handling of acronyms.

The search plugin for Bing was updated, while the plugins for Exalead and IOI were removed.
Common historical data operations were optimized, which should speed the daemon up a bit.
If you have gtk2 2.16 or newer, the query text field will have an embedded icon on the right-hand side, similar to Firefox's search box.

Finally, translations in Dutch, French, German, Hebrew, Portuguese and Spanish were updated. Many thanks to the various people who helped with these updates.

The Web site got a lot of visits over the last few months from Bulgaria and the Czech Republic, but unfortunately Pinot hasn't been translated for these countries' languages. If you can help with that, please consider joining Launchpad.

For the full details, see the NEWS file. Get the source from the download page.

Wednesday, 19 August 2009

OpenBSD port

Antoine Jacoutot successfully ported Pinot to OpenBSD. The package is available on
I will merge the related patches into the (much delayed) next release.
Thanks Antoine !

Friday, 26 June 2009

Pinot 0.94

Long time, no release. Pinot 0.94 is out today and brings :
- changes to the daemon's DBus interface to tell the UI to reopen the index when it has changed on disk.
- a new search filter "inurl" that allows finding files nested in an mbox or an archive at a given URL.
- the ability to view the properties of documents from an external index.
- better MIME type detection, which means fewer calls to external uncompressor programs when dealing with archives and fixes cases where documents nested in them couldn't be opened and viewed.
- the ability to index Debian packages.
- fixes to the mbox filter to fully work with GMime 2.4 (now required).
- a whole bunch of other bug fixes.

For the full details, see the NEWS file. Get the source tarball and RPM from the download page.

Thursday, 4 June 2009

It's June already

Since the release of 0.93, I haven't been able to spend much time on Pinot. Things should go back to "normal" in a week or two.

I am waiting for the release of Fedora 11, first because I am an avid Fedora user, and second because it comes with gcc 4.4.0 which will probably bring up a couple of interesting issues :-)

Several people have reported a compile error in 0.93 related to an "undefined reference to `DocumentInfo::setSize(long long)'" in Tokenize/FilterUtils.cpp. If you have experienced this, try the fix from SVN revision 1635.

Web stats show that the site had a lot of visits from Eastern Europe over the last few months, mostly from Bulgaria, Lithuania, the Czech Republic and Poland. These countries' languages are not supported yet, so if anyone would like to help translate Pinot using Launchpad, I am sure the effort will be much appreciated.

Monday, 13 April 2009

Pinot 0.93

A major bug crept in. On each run, the files history was reset in a way that caused the daemon to reindex all files (unless it was run in full scan mode).
I am not sure whether I should be embarrassed I didn't notice this bug earlier, or glad this kind of activity has become so unintrusive in recent releases...

Upgrading to 0.93 is strongly recommended. See the NEWS file, and get the source from the downloads page.

Friday, 10 April 2009

Pinot 0.92

I am releasing a new version this Good Friday.

A lot of work has gone into reducing memory usage on indexing. To start with, getting a grip on the amount of memory used by any program is tricky. Simply freeing unused buffers is not enough. Small buffers or buffers sitting between in-use buffers may not be reclaimed by the OS. For Pinot, this is especially true : the daemon goes through a large number of files of different sizes and reads their contents, which are run through filters to extract text. Each of these operations involves a transformation and a new memory buffer to hold the transformed data.
In order to minimize this, 0.92 allocates document content buffers from a memory pool backed by the malloc allocator, instead of the default STL allocator. Once in a while, the pool is released and malloc_trim() is called to hint that freed memory can be returned to the system.
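To give an idea of the moving parts, here is a minimal sketch of a malloc-backed allocator plugged into a string type, plus the malloc_trim() call. It is not lifted from Pinot's source: MallocAllocator, ContentBuffer and releaseFreedMemory are names made up for this example, and the pooling itself is left out. malloc_trim() is a glibc extension, so the call is guarded.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>
#include <string>
#if defined(__GLIBC__)
#include <malloc.h> // malloc_trim() is glibc-specific
#endif

// Illustrative allocator that goes straight to malloc()/free().
template <typename T>
struct MallocAllocator
{
    typedef T value_type;

    MallocAllocator() noexcept {}
    template <typename U>
    MallocAllocator(const MallocAllocator<U> &) noexcept {}

    T *allocate(std::size_t count)
    {
        // Guard against overflow in count * sizeof(T).
        if (count > static_cast<std::size_t>(-1) / sizeof(T))
        {
            throw std::bad_alloc();
        }
        // Always request at least one byte so a NULL result means failure.
        const std::size_t bytes = (count == 0) ? 1 : count * sizeof(T);
        void *mem = std::malloc(bytes);
        if (mem == NULL)
        {
            throw std::bad_alloc();
        }
        return static_cast<T *>(mem);
    }

    void deallocate(T *mem, std::size_t) noexcept
    {
        std::free(mem);
    }
};

// The allocator is stateless, so any two instances are interchangeable.
template <typename T, typename U>
bool operator==(const MallocAllocator<T> &, const MallocAllocator<U> &)
{
    return true;
}

template <typename T, typename U>
bool operator!=(const MallocAllocator<T> &, const MallocAllocator<U> &)
{
    return false;
}

// A string type whose storage comes from malloc().
typedef std::basic_string<char, std::char_traits<char>,
                          MallocAllocator<char> > ContentBuffer;

// Hints that freed heap memory can go back to the system.
// Returns true only if glibc reports that memory was actually released.
bool releaseFreedMemory()
{
#if defined(__GLIBC__)
    return malloc_trim(0) == 1;
#else
    return false;
#endif
}
```

A document's content would go into a ContentBuffer, and once a batch of buffers has been destroyed, releaseFreedMemory() is called. Whether anything is actually handed back depends on how fragmented the heap is, which is why it can only be called a hint.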

To make things harder, measuring memory usage effectively is not exactly straightforward. I chose to focus on the %MEM figure shown by top for the daemon and found that on my test box, it rises more slowly than with 0.91, peaks at a value 30 to 50 percent below 0.91's final memory usage, goes down to a single-digit value, then rises before coming down again cyclically. It's probably not perfect but future releases will bring further tweaks.

There's also a new filter based on libarchive that allows indexing the content of tar files (compressed or not) and ISO images. The UI can open/view files within indexed archives, just like it's long been able to open/view mbox messages and attachments. On the way, I partially redesigned how documents nested in other documents are indexed and as a consequence, indexes created with previous releases will be automatically upgraded.

Finally, 0.91 and older suffered from a bug that could cause a crash when libxml2 2.7.3 was used. This was fixed.

See the NEWS file for the details, and grab the source from the download page.

Friday, 6 March 2009

Pinot 0.91 is out

Time for another release. Ideally, this would have come out before the end of February, but the delay was worth it.

This release focuses on two things :
- fixing memory leaks that hit initial indexing badly.
With the help of valgrind and John Werden, several memory leaks were identified and fixed. I also rewrote the HTML filter based on the HTML parser from Xapian Omega after witnessing problems and deciding that ripping the filter's guts out would probably save a lot of time.
- improving command-line integration.
Stored queries created with the UI can be run with pinot-search. Similarly, pinot-index can open My Web Pages, My Documents or any other UI-configured index by name. In addition, it can finally deal with relative paths and index local directories recursively. This makes pinot-index a good alternative to omindex.

I have experienced some crashes in the UI when querying OpenSearch-based engines such as IOI, and found that these crashes went away after downgrading libxml2 from 2.7.3 to 2.7.2. I will look into this in more detail during the 0.92 cycle.

See the NEWS file for the complete list of changes. Grab the source from the download page or the binaries from your distro's packages repository in a few days' time.

Thursday, 29 January 2009

Pinot 0.90

To celebrate the start of the year of the Ox, I am releasing Pinot 0.90 !

Since the release of 0.89 in September, a lot of changes have been made on several fronts :
- Unicode text.
Charset conversion errors are better handled, tokenizing was improved and leads to far fewer "rubbish" terms. Issues the UI had with non-Latin locales were resolved.
- Portability.
The code base builds with MingW and hopefully without too many problems with GCC 4.4.
- Web metasearch.
Plugins were updated, extracts and results URLs are more accurate.
- more coherent UI.
Some features were previously only available in search mode, others in browse mode; some were duplicated. The new menu layout tries to unify both modes. The status window's refresh is smoother. Preferences can be opened separately from the UI. Spelling suggestions are less invasive: they pop up in the same tabs as query results.
- improved More Like This.
Stored queries generated on More Like This don't include the original query's terms, stopwords, infrequent terms or similar terms if the stemming language is set.
- command-line and desktop integration. implements a "tagged cd", and lets one change the shell's current directory to the directory that matches the path elements passed as parameter. The Deskbar module shows snippets when used with Deskbar 2.24.
- more flexible daemon.
The daemon is smarter at crawling symlinks, it skips those that refer to locations that have been crawled or that it knows will be crawled. It is much better at resuming where it stopped after user interruption. While user-set meta-data would previously be lost on a reindex, or when the file changed on the disk, it's now preserved.
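
The symlink rule in the last item can be sketched as follows. This is only an illustration, not the daemon's code: isInside and shouldSkipSymlink are hypothetical names, and the symlink target is assumed to have been resolved to an absolute path beforehand (with realpath(), for instance).

```cpp
#include <set>
#include <string>

// True when 'path' equals 'directory' or lies somewhere beneath it.
// Both are assumed to be absolute, resolved paths; apart from the root
// directory itself, 'directory' must not end with a slash.
bool isInside(const std::string &path, const std::string &directory)
{
    if (directory.empty() || path.empty())
    {
        return false;
    }
    if (directory == "/")
    {
        // Every absolute path lives under the root.
        return path[0] == '/';
    }
    if (path.compare(0, directory.size(), directory) != 0)
    {
        return false;
    }
    // Reject siblings that merely share a prefix, e.g. /home/me2 vs /home/me.
    return path.size() == directory.size() || path[directory.size()] == '/';
}

// 'knownLocations' holds the directories already crawled plus those queued
// for crawling; a symlink pointing into any of them can be skipped.
bool shouldSkipSymlink(const std::string &resolvedTarget,
                       const std::set<std::string> &knownLocations)
{
    for (std::set<std::string>::const_iterator it = knownLocations.begin();
         it != knownLocations.end(); ++it)
    {
        if (isInside(resolvedTarget, *it))
        {
            return true;
        }
    }
    return false;
}
```

The boundary check matters: with /home/me/Documents in the set, a link to /home/me/Documents/reports is skipped, but one to /home/me/Documents2 is still followed.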

With all these out of the way, I will be able to return to a monthly release cycle where each release brings incremental improvements and bug fixes and bring the project to its 1.0 release.

I would like to thank Adrian Bunk, Adel Gadllah, Martin Michlmayr, C. Scott Ananian and especially John Werden for their contributions to this release in the form of patches, ideas, suggestions and testing. As always the NEWS file has the details. Head to the download page to get the source.

Friday, 5 December 2008

Fedora 10 and MingW

I upgraded to Fedora 10 a few days after it came out. This looks like the best Fedora release ever. Upgrade went very well and the few issues I used to run into with Fedora 9 (sound, mostly) have vanished.
Plymouth, the tweaked boot process, works like a charm on my MacBook Pro, which surprises me since I thought anything other than ATI chipsets got a text-based boot.

One thing I am especially excited about is the integration of MingW for cross-compiling. Right now, it's not officially supported but it's enabled me to port the bulk of Pinot's source code over to Windows without having to jump through the many hoops that installing Windows+compiler+dependencies would have involved. I love the idea of compiling for Windows without Windows ;-) As a result, Pinot 0.90 will very probably include a beta-quality Windows port.

Congratulations to all Fedora developers !

Pinot goes nuclear

Funny... Every week I run the Web logs through a script of mine that geolocates visitors and queries whois. It turns out that the site has got a lot of visits from people at the French and US Departments of Nuclear Energy, namely the Commissariat a l'Energie Atomique based at Gif-sur-Yvette, France and the Department of Energy, Office of Energy Research at Germantown, MD, USA.
I would like to know what is up with that and if/how these two research labs are using Pinot :-)

Sunday, 2 November 2008

Pinot on the OLPC ?

Two weeks ago, I got a pleasant surprise. C. Scott Ananian, an OLPC hacker, used Pinot for his work on the next generation Journal program and gave a talk on this. Screencasts, slides, audio and video are available here.

Target for Pinot 0.90

Based on my task list for Pinot 0.90, I reckon it won't be ready before the end of the year. Most likely, it will be released sometime in January.
Besides fresh ideas I got from talking to people on the mailing list, there's a bunch of features I have meant to implement for a while that really ought to go in. I hope I'll have time to refresh the UI as well as enhance the daemon.

Friday, 24 October 2008


I am moving my blog from the old BerliOS shared blog to here. Stay tuned.