Use Python to get the web page data in Epiphany

Sometimes things are not as straightforward as one might think. Yesterday, I spent over two hours in the Epiphany Python Console, checking almost every available function in order to find a way to store the displayed page’s HTML data in a variable. Before giving up, I decided to ask for help on the #Epiphany IRC channel. All credit for the following Python code goes to JFR.

So, assuming that a web page is loaded in an Epiphany tab, launch the Python Console from within the browser.

We will need the epiphany module, so import it:

import epiphany

Next, we assign the active tab and its embed to variables. The console already provides the window variable, which refers to the current browser window:

tab = window.get_active_tab()
embed = tab.get_embed()

And now the critical part of getting the page’s HTML code:

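# Create a persist object, point it at the embed of the active tab,
# and serialise the loaded page into a string.
persist = epiphany.ephy_embed_factory_new_object(epiphany.EmbedPersist)
persist.set_flags(epiphany.EMBED_PERSIST_NO_VIEW | epiphany.EMBED_PERSIST_COPY_PAGE)
persist.set_embed(embed)
page = persist.to_string()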
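
If you need the page source more than once, the steps above can be collected into a small helper. This is only a sketch: the function name get_page_source is my own invention, and passing the epiphany module in as the ephy argument is just a convenience for this example. It uses only the calls shown above.

```python
def get_page_source(window, ephy):
    """Return the HTML of the page shown in the active tab of `window`.

    `ephy` is expected to be the imported `epiphany` module. Both the
    function name and this parameter are hypothetical, chosen for the sketch.
    """
    # Same sequence as the individual console commands above.
    embed = window.get_active_tab().get_embed()
    persist = ephy.ephy_embed_factory_new_object(ephy.EmbedPersist)
    persist.set_flags(ephy.EMBED_PERSIST_NO_VIEW | ephy.EMBED_PERSIST_COPY_PAGE)
    persist.set_embed(embed)
    return persist.to_string()


# In the Epiphany Python Console, after "import epiphany":
# page = get_page_source(window, epiphany)
```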

Then you can simply print the page code in the console, save it to a file, or filter it to extract the information you want.
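
As an illustration of the last two options, here is a minimal sketch that saves the HTML to a file and pulls out all link targets using only the standard library. The names LinkCollector, extract_links, save_page and sample_page are made up for this example, and I am assuming the page is UTF-8 text. The fallback import is there because the console may be running Python 2.

```python
import io

try:
    from html.parser import HTMLParser  # Python 3
except ImportError:
    from HTMLParser import HTMLParser   # Python 2


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag fed to the parser."""

    def __init__(self):
        HTMLParser.__init__(self)
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html):
    """Return a list with the href of each link found in `html`."""
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    return collector.links


def save_page(html, path):
    """Write `html` to `path` as UTF-8 (assumed encoding)."""
    if isinstance(html, bytes):
        # Byte strings are decoded first; undecodable bytes are replaced.
        html = html.decode("utf-8", "replace")
    with io.open(path, "w", encoding="utf-8") as handle:
        handle.write(html)


# Stand-in for the `page` variable obtained in the console.
sample_page = '<html><body><a href="http://example.com/">x</a><a name="top">y</a></body></html>'
links = extract_links(sample_page)
```

In the console you would pass page instead of sample_page, for example save_page(page, "/tmp/page.html"), where the path is only a placeholder.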

Use Python to get the web page data in Epiphany by George Notaras is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Copyright © 2006 - Some Rights Reserved

About George Notaras

George Notaras is the editor of the G-Loaded Journal, a technical blog about Free and Open-Source Software. George, among other things, is an enthusiast self-taught GNU/Linux system administrator. He has created this web site to share the IT knowledge and experience he has gained over the years with other people. George primarily uses CentOS and Fedora. He has also developed some open-source software projects in his spare time.