Use Python to get the web page data in Epiphany

September 23rd, 2006 by George Notaras

Sometimes, things are not as straightforward as one might think. Yesterday, I spent over two hours in the Epiphany Python Console, checking almost all of the available functions in order to find a way to store the displayed page’s HTML data in a variable. Before giving up, I decided to ask for help in the #Epiphany IRC channel. All credit for the following Python code goes to JFR.

So, assuming that a web page is loaded in an Epiphany tab, launch the Python Console from within the browser.

We will need the epiphany module, so import it:

import epiphany

Next, we assign the active tab and its embed to variables:

tab = window.get_active_tab()
embed = tab.get_embed()

And now the critical part of getting the page’s HTML code:

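# Create a persist object through the embed factory
persist = epiphany.ephy_embed_factory_new_object(epiphany.EmbedPersist)
# Set the persist flags
persist.set_flags(epiphany.EMBED_PERSIST_NO_VIEW | epiphany.EMBED_PERSIST_COPY_PAGE)
# Attach the persist object to the embed of the active tab
persist.set_embed(embed)
# Store the page's HTML in a variable
page = persist.to_string()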

The page variable now holds the page’s HTML code, so you can print it in the console, save it to a file, or filter it to get the information you want.
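As a rough illustration of the saving and filtering ideas, here is a minimal sketch that works on any HTML string. It runs outside Epiphany, so page is a stand-in value here; in the console you would use the variable obtained from persist.to_string() instead. The helper names (save_page, extract_title, extract_links) and the output file name are made up for this example, and the regular expressions are only a quick approximation, not a real HTML parser. The sketch is written for a current Python interpreter; the Python version embedded in an old Epiphany build may need small adjustments.

```python
import os
import re
import tempfile

# Stand-in for the string returned by persist.to_string().
# Inside the Epiphany console, use the real `page` variable instead.
page = (
    "<html><head><title>Example</title></head><body>"
    "<a href=\"http://www.g-loaded.eu/\">G-Loaded</a> "
    "<a href='/about'>About</a>"
    "</body></html>"
)


def save_page(html, path):
    """Write the HTML string to a file (hypothetical helper)."""
    with open(path, "w") as handle:
        handle.write(html)


def extract_title(html):
    """Return the text of the first <title> element, or None if absent."""
    match = re.search(r"<title[^>]*>(.*?)</title>", html, re.IGNORECASE | re.DOTALL)
    return match.group(1).strip() if match else None


def extract_links(html):
    """Return the href values of anchor tags (rough regex approximation)."""
    return re.findall(r'<a\s[^>]*?href=["\']([^"\']+)["\']', html, re.IGNORECASE)


# Hypothetical output location in the system's temporary directory.
out_path = os.path.join(tempfile.gettempdir(), "epiphany_page.html")
save_page(page, out_path)

print(extract_title(page))
print(extract_links(page))
```

For anything beyond quick checks like these, a proper HTML parser is likely a safer choice than regular expressions.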

The Use Python to get the web page data in Epiphany by George Notaras, unless otherwise expressly stated, is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 Unported License. Terms and conditions beyond the scope of this license may be available at www.g-loaded.eu.
