This notebook examines discrepancies between information returned by the API and the same information shown on the dev.to website.

Imports

In [1]:
import json

import pandas as pd
import plotly.graph_objects as go
import plotly.io as pio

pio.templates.default = 'plotly_white'
pio.renderers.default = 'notebook'

Load the data

The top_articles_by_tag.json file contains the 100 tags shown on https://dev.to/tags, along with two "Number of posts published" values for each tag: the one from the tags page (tag.num_articles) and the one from the individual tag page https://dev.to/tags/ (total).

The top_articles_by_tag_api.json file contains the top 100 tags returned by the API, with only the number of posts from the individual tag page https://dev.to/tags/ (total).

Both files also include the top 100 articles for each tag.

In [2]:
with open('../top_articles_by_tag.json') as f:
    data = json.load(f)

with open('../top_articles_by_tag_api.json') as f:
    data_api = json.load(f)
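
Before going further, it can help to confirm that every entry carries the fields described above. The sketch below is only an illustration: the helper name find_incomplete_entries and the sample values are hypothetical, and it assumes each entry is a dict with a nested tag dict (name, num_articles) and a top-level total, which is how the cells below read the website file. The exact layout of the API file is an assumption here.

```python
# Keys expected inside entry['tag'] (assumed from how the notebook reads the data)
REQUIRED_TAG_KEYS = ('name', 'num_articles')


def find_incomplete_entries(entries):
    """Return (index, missing_keys) pairs for entries lacking expected fields."""
    problems = []
    for index, entry in enumerate(entries):
        tag = entry.get('tag', {})
        missing = [f'tag.{key}' for key in REQUIRED_TAG_KEYS if key not in tag]
        if 'total' not in entry:
            missing.append('total')
        if missing:
            problems.append((index, missing))
    return problems


# Made-up sample entries, for illustration only
sample_entries = [
    {'tag': {'name': 'alpha', 'num_articles': 120}, 'total': 100},
    {'tag': {'name': 'beta'}, 'total': 50},  # no tags-page count, like an API-only entry
]
problems = find_incomplete_entries(sample_entries)
print(problems)
```

Calling find_incomplete_entries(data) and find_incomplete_entries(data_api) on the loaded files would then show which entries lack a tags page count.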

"Number of posts published" discrepancy

Let's see how the "Number of posts published" shown on the tags page differs from the one shown on the individual tag page:

In [3]:
# tag.num_articles is the count from the tags page; total is from the individual tag page.
count_diff = tags = pd.DataFrame(
    [
        [entry['tag']['name'], entry['tag']['num_articles'], entry['total']]
        for entry in data
    ],
    columns=['tag', 'tags_page', 'tag_page'],
).sort_values('tags_page', ascending=False)

In [4]:
df = count_diff
x = df['tag']

fig = go.Figure()
fig.add_trace(go.Scatter(x=x, y=df['tags_page'], mode='lines+markers', name='On tags page'))
fig.add_trace(go.Scatter(x=x, y=df['tag_page'], mode='lines+markers', name='On individual tag page'))

fig.update_layout(
    xaxis=dict(title='tag', tickmode='linear'),
    legend=dict(orientation='h', yanchor='auto', y=1.0, xanchor='auto', x=.5)
)
fig.show()
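
The plot shows the gap visually. To put numbers on it, one option is to compute the difference per tag. The following is a minimal sketch, not part of the original analysis: the function name count_discrepancies and the sample values are hypothetical, and it assumes, as described in the data section, that tag.num_articles is the tags page count and total is the individual tag page count.

```python
def count_discrepancies(entries):
    """Return one row per tag with both counts and their difference.

    Assumes each entry has entry['tag']['name'],
    entry['tag']['num_articles'] and entry['total'].
    """
    rows = []
    for entry in entries:
        tags_page = entry['tag']['num_articles']  # count shown on the tags page
        tag_page = entry['total']                 # count shown on the individual tag page
        diff = tags_page - tag_page
        rows.append({
            'tag': entry['tag']['name'],
            'tags_page': tags_page,
            'tag_page': tag_page,
            'diff': diff,
            # Difference relative to the individual tag page count; None if that count is 0
            'rel_diff': diff / tag_page if tag_page else None,
        })
    # Largest absolute discrepancy first
    return sorted(rows, key=lambda row: abs(row['diff']), reverse=True)


# Made-up sample values, for illustration only
sample = [
    {'tag': {'name': 'alpha', 'num_articles': 120}, 'total': 100},
    {'tag': {'name': 'beta', 'num_articles': 50}, 'total': 50},
    {'tag': {'name': 'gamma', 'num_articles': 10}, 'total': 40},
]
rows = count_discrepancies(sample)
for row in rows:
    print(row['tag'], row['diff'], row['rel_diff'])
```

Passing the loaded data list to count_discrepancies gives the same kind of rows for the real tags, and the result can be wrapped in pd.DataFrame(rows) if a table is more convenient.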