Use Python to get the web page data in Epiphany
Sometimes things are not as straightforward as one might think. Yesterday I spent over two hours in the Epiphany Python Console, checking almost every available function for a way to store the displayed page’s HTML in a variable. Before giving up, I asked for help on the #Epiphany IRC channel. All credit for the following Python code goes to JFR.
So, assuming that a web page is loaded in an Epiphany tab, launch the Python Console from within the browser.
We will need the epiphany module, so import it:
import epiphany
Next we assign the active tab and its respective embed to variables:
tab = window.get_active_tab()
embed = tab.get_embed()
And now the critical part of getting the page’s HTML code:
persist = epiphany.ephy_embed_factory_new_object( epiphany.EmbedPersist )
persist.set_flags( epiphany.EMBED_PERSIST_NO_VIEW | epiphany.EMBED_PERSIST_COPY_PAGE )
persist.set_embed( embed )
page = persist.to_string()
Then you can simply print the page code in the console, save it to a file, or filter it to extract the information you want.
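As a rough illustration of the saving and filtering step, here is a minimal sketch. It does not use the epiphany module at all: the page variable below is a hard-coded stand-in for the string returned by persist.to_string(), and the helper names save_page, extract_title and extract_links are my own, not part of any Epiphany API. It is written for a current Python interpreter, so it may need small adjustments to run inside the browser's console. The regular expressions are only good enough for quick filtering; a real HTML parser is the safer choice for anything serious.

```python
import os
import re
import tempfile

# Stand-in for the value of persist.to_string(); a hypothetical sample page.
page = (
    "<html><head><title>Example</title></head>"
    "<body><a href=\"http://example.org/\">link</a></body></html>"
)


def save_page(html, path):
    """Write the page source to a file as UTF-8 text."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(html)


def extract_title(html):
    """Return the contents of the first <title> element, or None."""
    match = re.search(r"<title[^>]*>(.*?)</title>", html,
                      re.IGNORECASE | re.DOTALL)
    return match.group(1).strip() if match else None


def extract_links(html):
    """Return the href values of all <a> tags found in the page source."""
    return re.findall(r'<a\s[^>]*href=["\'](.*?)["\']', html, re.IGNORECASE)


# Save the page to the system's temporary directory and print what we found.
target = os.path.join(tempfile.gettempdir(), "epiphany_page.html")
save_page(page, target)
print(extract_title(page))
print(extract_links(page))
```

In the console, you would drop the sample string and pass the real page variable to these functions instead.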
The article "Use Python to get the web page data in Epiphany" by George Notaras, unless otherwise expressly stated, is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 Unported License.