Personalized newsfeed with "interesting" articles

TL;DR: I built my own custom feed of personally interesting articles by scraping articles and feeding them to a model, all without extensive AI knowledge.

Ain’t Nobody Got Time For That

Have you ever felt exhausted by the sheer number of news articles out there? There are hundreds of websites with so many articles that no one can read them all. So we try to pick something relevant by skimming the first few headlines, hoping nothing falls through the cracks. It’s the Google page 2 problem all over again. Sure, some websites offer personalized news feeds, but not nearly enough of them do (not to mention the quality of those…).

Taking a step back and looking at the problem, I thought to myself: “There must be a solution for this.” Then it dawned on me: AI! Let some piece of software do that for me. But how does that even work? As you may or may not know, AI generally works better with more data. But there are no public data sets that match my exact interests. And what’s more, I don’t have the time or motivation to label more than 1,000 data points myself…

Time for a “plan”

  • Scrape some data
  • Label a few data points
  • Extrapolate labels (enter refinery)
  • Train an AI model (enter automl-docker)
  • Create a script that scrapes new articles every morning and feeds them to the model to build my personal newsfeed
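
To make the last step a bit more concrete, here is a minimal sketch of what such a daily script could look like. It is an illustration under assumptions, not the actual script: the sample feed, the URLs, and all function names are made up, and `keyword_score` is only a placeholder standing in for the trained model (it is not how refinery or automl-docker work). It uses only the Python standard library.

```python
import xml.etree.ElementTree as ET
from typing import Callable, List, Tuple

# Hypothetical sample feed; a real run would download RSS from a news site.
SAMPLE_RSS = """<rss version="2.0"><channel>
  <title>Example News</title>
  <item><title>New Python release speeds up startup</title>
        <link>https://example.com/python</link></item>
  <item><title>Celebrity spotted at airport</title>
        <link>https://example.com/celebrity</link></item>
  <item><title>Open source machine learning tool released</title>
        <link>https://example.com/ml</link></item>
</channel></rss>"""


def parse_feed(rss_text: str) -> List[Tuple[str, str]]:
    """Extract (title, link) pairs from an RSS 2.0 document."""
    root = ET.fromstring(rss_text)
    articles = []
    for item in root.iter("item"):
        title = item.findtext("title", default="").strip()
        link = item.findtext("link", default="").strip()
        if title:
            articles.append((title, link))
    return articles


# Placeholder "model": made-up keyword weights, capped at 1.0.
# In the real pipeline this would be the trained classifier.
INTERESTING_WORDS = {"python": 0.6, "machine": 0.5, "learning": 0.5, "open": 0.2}


def keyword_score(title: str) -> float:
    words = title.lower().split()
    return min(1.0, sum(INTERESTING_WORDS.get(word, 0.0) for word in words))


def build_feed(
    articles: List[Tuple[str, str]],
    score: Callable[[str], float],
    threshold: float = 0.5,
) -> List[Tuple[float, str, str]]:
    """Score every article, keep those at or above the threshold, best first."""
    scored = [(score(title), title, link) for title, link in articles]
    kept = [entry for entry in scored if entry[0] >= threshold]
    return sorted(kept, key=lambda entry: entry[0], reverse=True)


if __name__ == "__main__":
    for points, title, link in build_feed(parse_feed(SAMPLE_RSS), keyword_score):
        print(f"{points:.2f}  {title}  {link}")
```

Passing the scoring function in as a parameter means the placeholder can be swapped for a call to the trained model without touching the rest. To get the "every morning" part, a scheduler such as cron could run the script once a day.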

Could you be more specific?

Yes, I could, but this post is just meant to outline the general idea, since it’s a pattern that could be applied to more than just interesting news articles.

If you are specifically interested in how I achieved one or more of the steps, feel free to ask :blush: