Hello!
Thanks to http://www.wordle.net/, I’ve managed to generate a tag-cloud representation of the queries in the (in)famous AOL query log. The resulting picture is pretty nice… Hope you like it!

Aug
22
2008
C++ Class to access a TREC-formatted Document Collection
Posted by: admin in Uncategorized
Aug
18
2008
Google sparse_hash_table example in C++ on Ubuntu Linux
Posted by: admin in Uncategorized
Dear all, if you try to compile the example at http://goog-sparsehash.sourceforge.net/doc/sparse_hash_map.html on an Ubuntu Linux box, you will get an error saying that the hash template cannot be found… Just add using __gnu_cxx::hash; to the list of using declarations. This is the new example code
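As a minimal sketch of how the adjusted example might look (not necessarily the exact listing from the sparsehash page): the only change described above is the `using __gnu_cxx::hash;` line. The `<ext/hash_map>` include, the `days_in` helper, the `MonthMap` typedef and the `SKETCH_HAVE_SPARSEHASH` macro are assumptions added for illustration. If the sparsehash headers are not found, `std::unordered_map` stands in for `sparse_hash_map`, since it takes its template arguments in the same order.

```cpp
#include <cstring>
#include <ext/hash_map>  // GCC extension header that defines __gnu_cxx::hash (deprecated, may warn)

// Use the real container when the sparsehash headers are installed;
// otherwise std::unordered_map stands in (an assumption of this sketch).
#if defined(__has_include)
#  if __has_include(<google/sparse_hash_map>)
#    include <google/sparse_hash_map>
#    define SKETCH_HAVE_SPARSEHASH 1
#  endif
#endif
#ifndef SKETCH_HAVE_SPARSEHASH
#  include <unordered_map>
#endif

using __gnu_cxx::hash;  // the line the post says to add

// Compare C strings by content rather than by pointer value.
struct eqstr {
  bool operator()(const char* s1, const char* s2) const {
    return s1 == s2 || (s1 && s2 && std::strcmp(s1, s2) == 0);
  }
};

#ifdef SKETCH_HAVE_SPARSEHASH
typedef google::sparse_hash_map<const char*, int, hash<const char*>, eqstr> MonthMap;
#else
typedef std::unordered_map<const char*, int, hash<const char*>, eqstr> MonthMap;
#endif

// Hypothetical helper: number of days in `month`, or -1 if the month is unknown.
int days_in(const char* month) {
  MonthMap months;
  months["january"] = 31;
  months["february"] = 28;
  months["june"] = 30;
  MonthMap::const_iterator it = months.find(month);
  return it == months.end() ? -1 : it->second;
}
```

Calling `days_in("june")` from your own `main` looks the key up by string content, because `hash<const char*>` hashes the characters and `eqstr` compares them with `strcmp`.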
Dear All, many people today are studying query logs in order to obtain a view on what users usually look for on real-world search engines. Excite logs
Aug
09
2008
I’m Thrilled… A paper we submitted some time ago got accepted by ACM TWEB.
Posted by: admin in Research
I just learned that a paper of ours (written with colleagues at the Yahoo! Research lab in Barcelona) has been accepted for a special issue of ACM Transactions on the Web (TWEB)… I’m really thrilled about it.
Here are some details about the paper. Its title is “Design trade-offs for search engine caching” and my co-authors are:
Abstract:
“In this paper we study the trade-offs in designing efficient caching systems for Web search engines. We explore the impact of different approaches, such as static vs. dynamic caching, and caching query results vs. caching posting lists. Using a query log spanning a whole year we explore the limitations of caching and we demonstrate that caching posting lists can achieve higher hit rates than caching query answers. We propose a new algorithm for static caching of posting lists, which outperforms previous methods. We also study the problem of finding the optimal way to split the static cache between answers and posting lists. Finally, we measure how the changes in the query log affect the effectiveness of static caching, given our observation that the distribution of the queries changes slowly over time. Our results and observations are applicable to different levels of the data-access hierarchy, for instance, for a memory/disk layer or a broker/remote server layer.”

After a crash of the MySQL database on my previous machine, I’m happy to announce that my website is up and running again!