Embeddable Google Document Viewer

Posted September 10, 2009 at 2:30 pm by Akshay

Google Docs offers an undocumented feature that lets you embed PDF files and PowerPoint presentations in a web page. The files don’t have to be uploaded to Google Docs, but they need to be available online.

Here’s the code I used to embed the PDF file:

<iframe src="http://docs.google.com/gview?url=http://infolab.stanford.edu/pub/papers/google.pdf&embedded=true" style="width:600px; height:500px;" frameborder="0"></iframe>

Replace the document address in the url parameter (http://infolab.stanford.edu/pub/papers/google.pdf in this example) with the address of your own file. As mentioned above, the document viewer works with PDF and PPT files.
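If you generate the embed code for many documents, the snippet above can be produced programmatically. The following is only a sketch in Python: the function name and defaults are made up for illustration, and it assumes the viewer decodes its query string in the usual way, so the document address is percent-encoded before it is placed in the url parameter.

```python
from html import escape
from urllib.parse import urlencode

# Viewer endpoint taken from the iframe snippet in this post.
VIEWER_BASE = "http://docs.google.com/gview"


def build_embed_iframe(document_url: str, width: int = 600, height: int = 500) -> str:
    """Return iframe markup that points the viewer at document_url.

    Hypothetical helper for illustration; not part of any Google API.
    """
    # urlencode percent-encodes the document address, so its own ':', '/',
    # '?' and '&' characters cannot be mistaken for viewer parameters.
    query = urlencode({"url": document_url, "embedded": "true"})
    # escape() turns the '&' between parameters into '&amp;' for valid HTML.
    src = escape(f"{VIEWER_BASE}?{query}", quote=True)
    return (
        f'<iframe src="{src}" '
        f'style="width:{width}px; height:{height}px;" frameborder="0"></iframe>'
    )


if __name__ == "__main__":
    print(build_embed_iframe("http://infolab.stanford.edu/pub/papers/google.pdf"))
```

Encoding matters mostly when the document address carries its own query string; a plain address like the one above also works unencoded, as the original snippet shows.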

Some other sites that offer similar features: Zoho Viewer, PdfMeNot.

Posted in General, Google.

Latest Project: Spans.co.in

Posted August 18, 2009 at 5:35 pm by Akshay

Spans Envirotech has provided planning, design and construction management services since 1995, meeting the water and wastewater needs of municipalities, public agencies, private developers and industrial firms. The idea behind this microsite was an elegant yet modest, minimalistic design. The project uses WordPress as the underlying CMS, and the theme is designed around a specific client requirement: a dynamic sidebar. By default the sidebar displays their mission statement, but some pages have their own sidebar content, which is implemented with WordPress ‘custom fields’. If a page has subpages, they are listed in the sidebar; otherwise the sidebar displays the blog category list.

The homepage uses a custom template (home.php) which draws its content from the About page and the two most recent blog posts. In addition, the resources page uses a client-side table sorter, implemented with the jQuery plugin Tablesorter 2.0, and a custom JavaScript-based download tracker.

HTML Table Queries through WP Web Scraper

Posted August 11, 2009 at 3:07 pm by Akshay

Scraping HTML tables is easy, but parsing them has always been tricky. The next release of WP Web Scraper will address exactly that, with methods to query HTML tables within your scrape. For instance, the scraper will let you filter rows by the value of a specific table column and restrict the number of rows using ‘from’ and ‘to’ index keys.

It will also let you delete a column from the output and apply specific CSS classes to even and odd rows. The feature is designed for users who want to scrape data from HTML tables and then filter or parse it, and it will be implemented as a module within WP Web Scraper.
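To make the planned table queries concrete, here is a rough sketch of the four operations (filter by column value, ‘from’/‘to’ row limits, column removal, odd/even row classes). It is written in Python for brevity and is not the plugin’s code; the function names, parameter names and sample rows are all invented for illustration.

```python
from html import escape


def query_table(header, rows, where=None, row_from=0, row_to=None, drop=None):
    """Filter, slice and trim table rows given as lists of cell strings.

    where    -- optional (column name, wanted value) pair
    row_from -- index of the first row to keep (inclusive)
    row_to   -- index to stop at (exclusive), or None for no limit
    drop     -- optional name of a column to remove from the output
    """
    if where is not None:
        column, wanted = where
        idx = header.index(column)
        rows = [row for row in rows if row[idx] == wanted]
    rows = rows[row_from:row_to]
    if drop is not None:
        d = header.index(drop)
        header = header[:d] + header[d + 1:]
        rows = [row[:d] + row[d + 1:] for row in rows]
    return header, rows


def render_rows(rows, odd_class="odd", even_class="even"):
    """Render rows as <tr> elements with alternating CSS classes (row 1 is odd)."""
    lines = []
    for number, row in enumerate(rows, start=1):
        css = odd_class if number % 2 else even_class
        cells = "".join(f"<td>{escape(cell)}</td>" for cell in row)
        lines.append(f'<tr class="{css}">{cells}</tr>')
    return "\n".join(lines)


if __name__ == "__main__":
    # Made-up sample data, not real quotes.
    header = ["Symbol", "Exchange", "Last"]
    rows = [
        ["AAA", "NSE", "10"],
        ["BBB", "NASDAQ", "20"],
        ["CCC", "NSE", "30"],
        ["DDD", "NSE", "40"],
    ]
    _, kept = query_table(header, rows, where=("Exchange", "NSE"),
                          row_from=0, row_to=2, drop="Exchange")
    print(render_rows(kept))
```

Note that the filter runs before the row limits here, so ‘from’ and ‘to’ count matching rows rather than rows of the original table; the plugin may well choose the opposite order.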

Posted in Plugins, WordPress.

Commitment to Open Source

Posted July 24, 2009 at 1:23 am by Akshay

Ever since I started writing code in bits and pieces, I have dreamed of creating at least one major open source project. Although Open Source as a concept still revolves mostly around GNU/Linux and similar operating systems, I personally feel that any piece of code released with its complete source for general public use can broadly be categorized as Open Source. For me, it brings the pride of being useful to someone I don’t even know. It’s a blissful feeling to check my mail after a day’s work and find appreciation notes, comments and suggestions on a plugin or widget I have developed.

This post is a small thank-you note to all those who downloaded my WordPress plugins, Flash Photo Gallery and WP Web Scraper. Together they have received about 6,000 downloads in just about 5 months! Thanks for all your comments, suggestions, bug reports and donations!

Google Search as JSON feed

Posted June 30, 2009 at 1:45 pm by Akshay

Did you know that Google has an officially supported JSON feed of search results? Google half-way cancelled their SOAP API a while ago, but they now offer a parametrized URL that returns a JSON data set. Google says this REST approach is useful for “Flash developers, and those developers that have a need to access the AJAX Search API from other Non-Javascript environments.” This may be even simpler to use than the SOAP API. Here’s an example query:

http://ajax.googleapis.com/ajax/services/search/web?v=1.0&q=Paris%20Hilton

This URL format can also be adjusted to grab results from video search, book search and so on. Although the URL contains the word AJAX and is officially part of the Google AJAX Search API, it has nothing to do with AJAX per se, since it can be called from other environments, including the server side. All you need is a JSON library to parse the results (JSON stands for JavaScript Object Notation, but it does not require JavaScript either). The Yahoo Search API already uses a similar approach, though it can return XML as well. The complete documentation of this Search API can be found in the Developer’s Guide of the Google AJAX Search API, hosted on Google Code.
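As a rough illustration of calling this from a non-JavaScript environment, here is a Python sketch that builds the query URL shown above and parses a response with a standard JSON library. No network request is made: the sample payload is hand-written, and its field names (responseData, results, titleNoFormatting, url, responseStatus, responseDetails) follow the response layout as I recall it from the Developer’s Guide, so treat them as assumptions and check them against the documentation.

```python
import json
from urllib.parse import quote, urlencode

# Endpoint taken from the example query in this post.
SEARCH_BASE = "http://ajax.googleapis.com/ajax/services/search/web"


def build_search_url(query: str) -> str:
    """Build the parametrized search URL; spaces are encoded as %20."""
    return f"{SEARCH_BASE}?{urlencode({'v': '1.0', 'q': query}, quote_via=quote)}"


def extract_results(raw_json: str):
    """Return (title, url) pairs from a response body.

    The field names used here are assumptions about the response layout.
    """
    payload = json.loads(raw_json)
    if payload.get("responseStatus") != 200:
        raise ValueError(payload.get("responseDetails") or "search request failed")
    return [
        (item["titleNoFormatting"], item["url"])
        for item in payload["responseData"]["results"]
    ]


# Hand-written sample payload for illustration only, not real search output.
SAMPLE_RESPONSE = """
{
  "responseData": {
    "results": [
      {"titleNoFormatting": "Example result one", "url": "http://example.com/one"},
      {"titleNoFormatting": "Example result two", "url": "http://example.com/two"}
    ]
  },
  "responseDetails": null,
  "responseStatus": 200
}
"""

if __name__ == "__main__":
    print(build_search_url("Paris Hilton"))
    for title, url in extract_results(SAMPLE_RESPONSE):
        print(f"{title} -> {url}")
```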

Posted in API, Google.

Stock Market Data API (Beta release)

Posted June 24, 2009 at 7:56 pm by Akshay

Stock exchanges around the world currently publish data feeds in CSV format. This is quite portable and easy to implement, but it cannot be used directly in your web applications without server-side code. These official feeds are also generally very expensive, affordable only to enterprise-level developers. That’s where the Stock Market Data (SMD) API comes in. This simple HTTP web service presents a snapshot of the latest stock market data in various Web 2.0 formats such as JSON, RSS, ATOM and MDDL (Market Data Definition Language, XML for market data), or good old CSV. SMD API currently supports the NASDAQ Stock Exchange, the New York Stock Exchange (NYSE), the Bombay Stock Exchange (BSE) and the National Stock Exchange, India (NSE), and we are actively working on supporting others soon.

Please note that SMD API is currently released as a beta and is quite unstable. I am still experimenting with everything from the domain name of the service (smdapi.co.cc, which is a temporary one) to the API structure itself. I am also very much open to ideas that I can incorporate into the project before it is officially released.
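The kind of server-side conversion involved, turning a CSV snapshot into JSON that a web page can consume directly, can be sketched in a few lines of Python. This is not the SMD API implementation; the column names and the sample row below are invented, and real exchange feeds use their own layouts.

```python
import csv
import io
import json


def csv_snapshot_to_json(csv_text: str) -> str:
    """Convert a CSV snapshot with a 'symbol' column into a JSON object keyed by symbol.

    The 'symbol' column name is an assumption; every other column is
    treated as a numeric field.
    """
    quotes = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        symbol = row.pop("symbol")
        quotes[symbol] = {field: float(value) for field, value in row.items()}
    return json.dumps(quotes, indent=2, sort_keys=True)


if __name__ == "__main__":
    # Made-up snapshot, not real market data.
    sample = "symbol,open,high,low,last\nAAA,10.0,12.5,9.5,11.0\n"
    print(csv_snapshot_to_json(sample))
```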

Collaboration – ready to take over Web 2.0

Posted June 12, 2009 at 6:51 pm by Akshay

At first there was good old email with petty 5 MB inboxes. Then we moved to crowded chat rooms on Yahoo. All that seemed like the stone age once a host of interactive web applications arrived, such as blogs and social networking applications: initially Orkut, then Facebook and then microblogging through Twitter. Have you ever wondered what could be the next big Web 2.0 trend?

The answer is ‘Collaboration’. Yes, collaboration not only in its own sense, but also as a collaboration of all the trends and technologies mentioned above. I know the word ‘collaboration’ sounds too ‘enterprise’ and not at all ‘social’, but as always the web has taken us by surprise. What I am referring to is not far-fetched. Big players like Google and Yahoo have already stepped into this domain and made grand releases. Check out Google Wave and Zimbra (by Yahoo, but not much talked about yet) to understand what I am pointing at. Google Wave is a new tool for real-time communication and collaboration on the web, coming later this year. An official 90-minute video demo is already out for a sneak peek into this new wonderland. Don’t have 90 minutes? Have a look at my detailed post on Devil’s Workshop, with 30-60 second clips highlighting the best parts of Google Wave.

WP Web Scraper – A WordPress Stock Market plugin

Posted June 8, 2009 at 6:05 pm by Akshay

This is probably a major milestone in the lifecycle of the WP Web Scraper WordPress plugin. Technically speaking, the plugin gets its own ‘module architecture’, which can incorporate unlimited extensions without touching the core codebase. Non-technically speaking, this opens WP Web Scraper to non-techie WordPress users. To start off, this mod extends the plugin with a specific shortcode to get stock market data from NSE and NASDAQ (more exchanges are coming soon). The data is scraped with a cache interval of one minute (which can be increased as per your requirement) and includes data types such as Open, High, Low, Last Price, Previous Close, Change, Change Percentage and Volume for all active symbols on these exchanges.
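The cache interval works roughly like this: a scraped value is reused until it is older than the interval, and only then is the source page fetched again. Below is a minimal Python sketch of that idea; it is not the plugin’s code, and the class and parameter names are invented.

```python
import time


class TimedCache:
    """Reuse a fetched value until it is older than ttl_seconds."""

    def __init__(self, ttl_seconds: float = 60.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock      # injectable so the cache can be tested without sleeping
        self._entries = {}       # key -> (time stored, value)

    def get(self, key, fetch):
        """Return the cached value for key, calling fetch() only when it is missing or stale."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]
        value = fetch()
        self._entries[key] = (now, value)
        return value


if __name__ == "__main__":
    cache = TimedCache(ttl_seconds=60.0)

    def scrape_last_price():
        # Stand-in for a real scrape; returns a made-up value.
        print("scraping...")
        return "101.50"

    cache.get(("nse", "acc", "last"), scrape_last_price)  # triggers the scrape
    cache.get(("nse", "acc", "last"), scrape_last_price)  # served from the cache
```

A longer interval means fewer requests to the source site at the cost of staler quotes.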

The plugin API provides a simple shortcode, for example [wpws_market_data market="nse" symbol="acc" datatype="last"] or [wpws_market_data market="nasdaq" symbol="csco" datatype="open"]. NSE data is currently scraped from nseindia.com and NASDAQ data is scraped from reuters.com. The immediate plan is to implement all major stock markets in this API. Later, I plan to extend this modular architecture to other categories of scrapes, such as weather and sports scores.
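To show what such a shortcode carries, here is a small Python sketch that splits one into its tag name and attributes. WordPress has its own Shortcode API in PHP for this, so the regular expressions and the function name below are purely illustrative and handle only double-quoted attributes.

```python
import re

# Matches one [tag attr="value" ...] shortcode; illustrative, not WordPress's own parser.
SHORTCODE_RE = re.compile(r'\[(?P<tag>\w+)(?P<attrs>[^\]]*)\]')
ATTR_RE = re.compile(r'(\w+)="([^"]*)"')


def parse_shortcode(text: str):
    """Return (tag, attributes) for the first shortcode in text, or None if there is none."""
    match = SHORTCODE_RE.search(text)
    if match is None:
        return None
    return match.group("tag"), dict(ATTR_RE.findall(match.group("attrs")))


if __name__ == "__main__":
    print(parse_shortcode('[wpws_market_data market="nse" symbol="acc" datatype="last"]'))
```

A module would then use the market, symbol and datatype values to decide which page to scrape and which figure to return.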

Plan of action for WP Web Scraper

Posted May 27, 2009 at 12:20 pm by Akshay

My latest WordPress plugin for web scraping, WP Web Scraper, had a grand launch. It recorded more than 200 downloads in the first two days alone! Thanks for all the appreciation and comments. This post lists my plan to extend WP Web Scraper into a standard scraping framework. Apart from making it a flexible framework, I also plan to introduce some pre-built modules that make specific, highly desired scraping tasks easy. The first such module will be a stock market data grabber. It will extend the plugin to get stock market data easily from the websites of various big exchanges (I plan to support NSE, BSE and NASDAQ to start off with). The data will be almost real time (with a delay of 1 to 10 minutes) and will include Open, High, Low, Last Price, Previous Close, Change, Change Percentage and Volume for all active symbols on these exchanges.

The plugin API will provide a shortcode something like this: [wpws mod="nse" symbol="acc" datatype="last"] should output the latest price for ACC listed at NSE. The aim is to make it an extendable module framework, and hence I am taking time to code it well. Apart from these features, I am also planning to improve the core scraper with a regex-powered cleanup function to remove all unwanted text strings from the scrape, and a more flexible algorithm to query HTML tables returned by the scrape.
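The regex-powered cleanup could look something like the following Python sketch: each unwanted pattern is removed from the scraped text, and the leftover whitespace is collapsed. The function name and the sample patterns are invented; the plugin’s eventual interface may differ.

```python
import re


def clean_scrape(text: str, unwanted_patterns) -> str:
    """Remove every match of each regex pattern, then collapse runs of whitespace."""
    for pattern in unwanted_patterns:
        text = re.sub(pattern, "", text)
    return re.sub(r"\s+", " ", text).strip()


if __name__ == "__main__":
    # Made-up scraped fragment for illustration.
    raw = "Last price: 101.50 (delayed)  "
    print(clean_scrape(raw, [r"\(delayed\)", r"^Last price:"]))
```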

Bringing Web Scraping to WordPress!

Posted May 24, 2009 at 11:54 am by Akshay

Web scraping (or Web harvesting, Web data extraction) is a computer software technique of extracting information from websites. Web scraping focuses more on the transformation of unstructured Web content, typically in HTML format, into structured data that can be formatted and displayed or stored and analyzed. Exemplary uses of Web scraping include online price comparison, weather data monitoring, market data tracking, Web content mashup and Web data integration.

Imagine what you can do with all this power in your WordPress blog! Pages and posts can display real-time content from other pages, letting you create a mashup of content. All of this is now possible with my WP Web Scraper plugin. It’s an easy-to-implement, professional web scraper for WordPress that can display real-time data from any website directly in your posts, pages or sidebar. Use it to include real-time stock quotes, cricket or soccer scores or any other generic content. The scraper is built on time-tested libraries: cURL for scraping and phpQuery for parsing HTML. Please post all your suggestions and thoughts about this on the WP Web Scraper project page.
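The core step, turning unstructured HTML into structured data, can be illustrated without the plugin. The sketch below uses Python’s standard-library HTML parser as a stand-in for cURL and phpQuery, and works on an HTML string instead of fetching a page; the class name, function name and sample markup are invented for illustration.

```python
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None    # cells of the row being read, or None outside a row
        self._cell = None   # text fragments of the cell being read, or None outside a cell

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def extract_table(html: str):
    """Return the table rows found in html as lists of cell strings."""
    parser = TableExtractor()
    parser.feed(html)
    parser.close()
    return parser.rows


if __name__ == "__main__":
    # Made-up markup standing in for a scraped page.
    page = ("<table><tr><th>Symbol</th><th>Last</th></tr>"
            "<tr><td> AAA </td><td><b>10</b>.5</td></tr></table>")
    print(extract_table(page))
```

Once the rows are plain lists, they can be formatted for display, stored, or filtered further.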