How to extract text from an HTML file in Python?

How to fetch HTML content in Python?

I have an HTML source in a variable, for example a page containing the text "My document" and "Hello World".

I want to fetch it in Python to get some information from it (and save the content of the page). What can I do?

I tried something like this:

    htmlsource = " My document Hello World "
    # Get the content of the variable htmlsource
    print(re.

Do you want to print the content of the page or parse it? To parse it, you can use BeautifulSoup or lxml:

    from bs4 import BeautifulSoup
    soup = BeautifulSoup(htmlsource, "lxml")
    print(soup.text)

See also this post for more info.

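The BeautifulSoup suggestion above relies on third-party packages (bs4 and lxml). If installing them is not an option, roughly the same result can be sketched with the standard library's html.parser module instead. This is a different technique from the one in the answer, not a drop-in equivalent. The class name TextExtractor, the helper extract_text, and the sample markup are all hypothetical, since the original tags in the question did not survive.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect text nodes, skipping the contents of <script> and <style>."""

    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        # Entering a tag whose contents are not human-readable text.
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep non-empty text that is outside skipped tags.
        if self._skip_depth == 0 and data.strip():
            self._chunks.append(data.strip())

    def get_text(self):
        return "\n".join(self._chunks)


def extract_text(html_source):
    """Return the text content of an HTML string, one text node per line."""
    parser = TextExtractor()
    parser.feed(html_source)
    parser.close()
    return parser.get_text()


# Assumed sample markup; the question's real markup is unknown.
htmlsource = (
    "<html><head><title>My document</title></head>"
    "<body><h1>Hello World</h1></body></html>"
)

text = extract_text(htmlsource)
print(text)
```

The extracted string can then be written to disk with an ordinary open(..., "w") call if the page text needs to be saved. For messy real-world pages, BeautifulSoup is likely to be more forgiving than this sketch.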