Automate Bard

Gather Headlines With This Simple Site Crawler

Posted on April 20, 2023 (updated May 12, 2023) by user
Automation, data gathering

In this post, we will show a simple Python script using Selenium and ChromeDriver to connect to CNBC.com and print out all headlines on the front page.

ChromeDriver

Before the code can talk with your Google Chrome web browser, you must download and install the proper ChromeDriver.exe file. See this post for instructions on how to download and install the proper ChromeDriver for the version of Google Chrome you have.

Automating Google’s Bard AI Using Python & Selenium
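
Before launching the browser, it can help to confirm that the driver file is where the script expects it. The following is a minimal sketch and not part of the original script; the helper name find_chromedriver and the two file names it looks for are assumptions made for illustration.

```python
from pathlib import Path


def find_chromedriver(folder="."):
    """Return the path to a ChromeDriver binary in `folder`, or None.

    Hypothetical helper: it only checks for the two usual file names,
    'chromedriver' (macOS/Linux) and 'chromedriver.exe' (Windows).
    """
    for name in ("chromedriver", "chromedriver.exe"):
        candidate = Path(folder) / name
        if candidate.is_file():
            return candidate
    return None


if __name__ == "__main__":
    path = find_chromedriver()
    if path is None:
        print("ChromeDriver not found; download it before running the crawler")
    else:
        print(f"Using ChromeDriver at {path}")
```

A check like this does not verify that the driver matches your Chrome version; it only catches the case where the file is missing from the script's folder.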

Python Code

# Python code generated by AutomateBard.com

from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By


# Main Script
if __name__ == '__main__':

    driver = None
    try:
        print("Connecting to ChromeDriver")
        # Selenium 4 expects the driver path wrapped in a Service object
        driver = webdriver.Chrome(service=Service('./chromedriver'))
        driver.implicitly_wait(1.0)

        print("Connecting to CNBC")
        driver.get("https://www.cnbc.com")

        # Find all news headlines
        headlines = driver.find_elements(By.CLASS_NAME, "RiverHeadline-headline")

        # Print all headlines
        for headline in headlines:
            print(headline.text)

    except Exception as e:
        # DO NOT DO THIS. Use proper exception handling!
        print(e)

    finally:
        # Quit only if the browser actually started
        if driver is not None:
            print("Closing ChromeDriver")
            driver.quit()

How The Code Works

The code first connects to www.CNBC.com. Then it searches the page for all HTML elements that have the CSS class “RiverHeadline-headline”. NOTE: Be sure to use .find_elements rather than .find_element if you want to capture more than one item; .find_element returns only the first match. Because .find_elements returns a list, we can use a for loop to print out each headline.
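
Since the result is an ordinary list of elements, each exposing a .text property, it can be post-processed like any other Python list. The sketch below is an illustration only: the helper name clean_headlines is hypothetical, and it assumes the page may yield blank or repeated entries, which the original post does not state. The SimpleNamespace objects stand in for the elements Selenium would return, so the example runs without a browser.

```python
from types import SimpleNamespace


def clean_headlines(elements):
    """Return stripped, non-empty, de-duplicated headline texts in page order."""
    seen = set()
    result = []
    for element in elements:
        text = element.text.strip()
        if text and text not in seen:
            seen.add(text)
            result.append(text)
    return result


if __name__ == "__main__":
    # Stand-ins for the element objects Selenium would return
    fake_elements = [
        SimpleNamespace(text="Tesla shares fall on earnings drop"),
        SimpleNamespace(text=""),
        SimpleNamespace(text=" Tesla shares fall on earnings drop "),
    ]
    for headline in clean_headlines(fake_elements):
        print(headline)
```

In the script above, the same helper could be called as clean_headlines(headlines) in place of the plain for loop.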

Results

Run against CNBC.com on April 20, 2023, the program returned the following headlines:

  • MyPillow CEO Mike Lindell ordered to pay $5 million to man who debunked election-fraud claim
  • Small caps will be large this year, says Jefferies. Here are 10 buy ideas
  • DOJ charges 18 people — including doctors — in massive Covid health-care fraud takedowns
  • Read the internal memo Alphabet sent in merging A.I.-focused groups DeepMind and Google Brain
  • Alec Baldwin lawyers say manslaughter charges to be dropped in ‘Rust’ movie set shooting
  • Kyiv says it’s time for NATO to invite Ukraine into the alliance — not just to a summit
  • Ford F-150 Lightning fire footage highlights a growing EV risk
  • Kendall’s $29 million ‘Succession’ home: We went to ‘great lengths to create something truly unique’
  • Taylor Swift sidestepped FTX lawsuit by asking a simple question
  • Wells Fargo says this regional bank stock that got caught up in crisis should rebound by 60%
  • Coinbase secures Bermuda license, and EU approves framework for crypto regulation: CNBC Crypto World
  • BuzzFeed will lay off 15% of staff, shutter its news unit
  • Senate invites Supreme Court Chief Justice Roberts to testify after Clarence Thomas ethics scandal
  • Savings account interest rates just hit a 15-year high, but fewer Americans are benefitting
  • This free online paycheck withholdings tool may help you avoid a tax bill for 2023, IRS says
  • Nvidia, Microsoft and more: CNBC’s ‘Halftime Report’ traders answer your questions
  • Meta and Disney begin cutting jobs. Here’s a rundown of all Club names planning major layoffs
  • AT&T shares sink after company posts softer than expected revenue, cash flow
  • 10 financial lessons to learn as your priorities shift in your 20s, 30s, 40s and 50s
  • Tesla shares fall on earnings drop

Other Considerations

Each website may mark up its headlines differently. For example, while CNBC uses the class name “RiverHeadline-headline”, MarketWatch.com uses the class name “article__headline”. Google seems to use the “h4” tag on its news.google.com site.

For Google News, use:

headlines = driver.find_elements(By.TAG_NAME, "h4")
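
One way to keep these per-site differences in one place is a small lookup table keyed by hostname. This is a sketch under stated assumptions: the names HEADLINE_LOCATORS and headline_locator are hypothetical, the class and tag names are the ones quoted in this post and may no longer match the live sites, and the strategy strings "class name" and "tag name" are the values behind Selenium's By.CLASS_NAME and By.TAG_NAME.

```python
from urllib.parse import urlparse

# Strategy strings equal Selenium's By.CLASS_NAME and By.TAG_NAME values.
# Selectors are taken from this post and may be out of date.
HEADLINE_LOCATORS = {
    "cnbc.com": ("class name", "RiverHeadline-headline"),
    "marketwatch.com": ("class name", "article__headline"),
    "news.google.com": ("tag name", "h4"),
}


def headline_locator(url):
    """Return the (strategy, value) pair for a URL's site, or None if unknown."""
    host = urlparse(url).hostname or ""
    # Treat "www.cnbc.com" and "cnbc.com" as the same site
    if host.startswith("www."):
        host = host[4:]
    return HEADLINE_LOCATORS.get(host)


if __name__ == "__main__":
    for url in ("https://www.cnbc.com", "https://news.google.com"):
        print(url, headline_locator(url))
```

With a running driver, the pair can be unpacked straight into the call, for example headlines = driver.find_elements(*headline_locator(url)), after checking that the result is not None.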
