We recently showcased how you can proxy the Stack Exchange API using Streamdata.io. The Stack Exchange API provides programmatic access to a variety of Q&A websites, including the wildly popular Stack Overflow, where developers ask questions and share answers across a variety of topics. The API provides access to a wealth of data about what is happening each day within the tech sector, and it is something we wanted to explore further when it comes to building machine learning models.

Using the Stack Overflow platform, we’d like to better understand which companies people are talking about and, eventually, train a set of machine learning models that can be used to make some predictions. This project will roll out in several stages, but to begin working toward this objective, we wanted to start by processing all URLs that are referenced within each question and answer. We already showed how to proxy the questions API from the Stack Exchange API, establishing a stream of questions and answers from Stack Overflow. Now we’d like to add some code that pulls any URLs present in the streaming responses and stores them separately for aggregating and counting by domain.

Our goal is to establish real-time counts of the URLs being referenced by programmers on Stack Overflow. We’ll be aggregating them and counting them up by the day, week, and month, to see which domains are the most referenced on the site. We want to have enough historical data to train some machine learning (ML) models that we can keep up to date in real time, and also use to help predict future trends a week, a month, or several months out. We’d like to understand the potential for using Stack Overflow as a way of keeping our finger on the pulse of which technological trends are growing in strength, and to identify newer trends early on. By aggregating the URLs being referenced, we should be able to better understand which domains are getting the most mindshare among developers today.

Once we have a year’s worth of historical data and are pulling URLs in real time via the Stack Overflow streams we are setting up, we’ll start looking at other data points we can aggregate via the platform. Beyond URLs, we’ll look for specific company and product names, as well as variations on those, so that we can get more specific about what we track than the domain alone.
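To make the URL step concrete, here is a minimal Python sketch of pulling URLs out of question and answer bodies and counting them by domain per day, week, or month. It is an illustration under stated assumptions, not the code we run: the function names (`extract_urls`, `domain_of`, `period_key`, `count_domains`) are hypothetical, and it assumes each item arrives as a dictionary with an HTML `body` string and a `creation_date` in Unix epoch seconds, which is how we understand the Stack Exchange API to return items when a filter that includes the body is requested.

```python
import re
from collections import Counter
from datetime import datetime, timezone
from urllib.parse import urlparse

# Matches http(s) URLs up to whitespace, quotes, or angle brackets.
URL_PATTERN = re.compile(r"https?://[^\s\"'<>]+")


def extract_urls(body):
    """Return every http(s) URL found in an HTML or plain-text body."""
    # Trailing punctuation usually belongs to the sentence, not the URL.
    return [url.rstrip(".,;:!?)") for url in URL_PATTERN.findall(body)]


def domain_of(url):
    """Return the lowercased host name without a leading 'www.'."""
    try:
        host = (urlparse(url).hostname or "").lower()
    except ValueError:
        # Malformed URLs (for example a broken IPv6 literal) are skipped.
        return ""
    return host[4:] if host.startswith("www.") else host


def period_key(timestamp, granularity):
    """Map epoch seconds to a day, ISO week, or month label (UTC)."""
    moment = datetime.fromtimestamp(timestamp, tz=timezone.utc)
    if granularity == "day":
        return moment.strftime("%Y-%m-%d")
    if granularity == "week":
        iso_year, iso_week, _ = moment.isocalendar()
        return f"{iso_year}-W{iso_week:02d}"
    if granularity == "month":
        return moment.strftime("%Y-%m")
    raise ValueError(f"unknown granularity: {granularity}")


def count_domains(items, granularity="day"):
    """Count domain references per (period, domain) pair.

    Assumes each item has 'body' and 'creation_date' keys (an assumption
    about the payload, not something this sketch verifies).
    """
    counts = Counter()
    for item in items:
        period = period_key(item["creation_date"], granularity)
        # A set keeps a link from counting twice when the same URL
        # appears in both the href and the link text of one post.
        for url in set(extract_urls(item.get("body", ""))):
            domain = domain_of(url)
            if domain:
                counts[(period, domain)] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical sample items standing in for streamed questions.
    sample_items = [
        {"creation_date": 1514764800,
         "body": '<p>See <a href="https://docs.python.org/3/library/re.html">the docs</a>.</p>'},
        {"creation_date": 1514768400,
         "body": "<p>I used https://www.github.com/ and https://docs.python.org/3/.</p>"},
    ]
    for (period, domain), total in sorted(count_domains(sample_items).items()):
        print(period, domain, total)
```

Counting each distinct URL once per post is a design choice made for this sketch; counting every occurrence instead would only require dropping the `set(...)` call. The same counter can be rebuilt at a coarser grain by passing `granularity="week"` or `granularity="month"`.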