This notebook compares some of the information returned by the API with what the dev.to website shows, and highlights the discrepancies between the two.
import json
import pandas as pd
import plotly.graph_objects as go
import plotly.io as pio
pio.templates.default = 'plotly_white'
pio.renderers.default = 'notebook'
The top_articles_by_tag.json file has the 100 tags shown on the page https://dev.to/tags, with two values for "Number of posts published": the one from the tags page (tag.num_articles), and the one from the individual tag page (https://dev.to/tags/total).
The top_articles_by_tag_api.json file has the 100 top tags returned by the API, and only the number of posts from the individual tag page (https://dev.to/tags/total).
Both have the top 100 articles for each tag.
# Tags scraped from the website's tags page
with open('../top_articles_by_tag.json') as f:
    data = json.load(f)
# Top tags as returned by the API
with open('../top_articles_by_tag_api.json') as f:
    data_api = json.load(f)
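Before comparing the counts, it can help to confirm that the loaded entries have the shape described above. The following is a minimal sketch, not part of the original analysis: the helper name `check_entries` is hypothetical, and it assumes each entry is a dict with a `tag` dict (holding `name` and, for the scraped file, `num_articles`) and a `total` count. The sample entries are made up for illustration.

```python
def check_entries(entries, require_num_articles=True):
    """Return a list of human-readable problems found in tag entries."""
    problems = []
    for i, entry in enumerate(entries):
        tag = entry.get('tag', {})
        if 'name' not in tag:
            problems.append(f'entry {i}: missing tag.name')
        if 'total' not in entry:
            problems.append(f'entry {i}: missing total')
        # Only the scraped file is expected to carry the tags-page count
        if require_num_articles and 'num_articles' not in tag:
            problems.append(f'entry {i}: missing tag.num_articles')
    return problems

# Made-up sample entries, not real dev.to data
sample = [
    {'tag': {'name': 'python', 'num_articles': 10}, 'total': 12},
    {'tag': {'name': 'rust'}, 'total': 5},
]
sample_problems = check_entries(sample)
print(sample_problems)
```

On the real data this would be called as `check_entries(data)` and `check_entries(data_api, require_num_articles=False)`, assuming the API file uses the same `tag` and `total` keys.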
Let's see the difference in "Number of posts published" shown in the tags page vs the individual tag page:
count_diff = tags = pd.DataFrame(
    [
        [
            entry['tag']['name'],
            entry['tag']['num_articles'],  # count shown on the tags page
            entry['total'],                # count shown on the individual tag page
        ]
        for entry in data
    ],
    columns=['tag', 'tags_page', 'tag_page'],
).sort_values('tags_page', ascending=False)
df = count_diff
x = df['tag']
fig = go.Figure()
fig.add_trace(go.Scatter(x=x, y=df['tags_page'], mode='lines+markers', name='On tags page'))
fig.add_trace(go.Scatter(x=x, y=df['tag_page'], mode='lines+markers', name='On individual tag page'))
fig.update_layout(
    xaxis=dict(title='tag', tickmode='linear'),
    legend=dict(orientation='h', yanchor='auto', y=1.0, xanchor='auto', x=.5)
)
fig.show()
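The plot shows both series side by side; the gap between them can also be quantified directly. The sketch below is one possible way to do that, under the assumption that the frame has the `tag`, `tags_page`, and `tag_page` columns used above. The helper `add_diff_columns`, the `abs_diff` and `rel_diff` column names, and the sample numbers are all hypothetical.

```python
import pandas as pd

# Made-up numbers for illustration only, not real dev.to counts
sample = pd.DataFrame({
    'tag': ['a', 'b', 'c'],
    'tags_page': [100, 50, 20],
    'tag_page': [90, 50, 25],
})

def add_diff_columns(df):
    """Return a copy of df with absolute and relative count differences."""
    out = df.copy()
    # Positive values mean the tags page reports more posts
    out['abs_diff'] = out['tags_page'] - out['tag_page']
    # Difference as a fraction of the tags-page count
    out['rel_diff'] = out['abs_diff'] / out['tags_page']
    return out

sample_diff = add_diff_columns(sample)
# Largest discrepancies first, regardless of sign
print(sample_diff.sort_values('abs_diff', key=lambda s: s.abs(), ascending=False))
```

With the real data, `add_diff_columns(count_diff)` would give the same two columns for every tag. A tag whose `tags_page` count is zero would produce an infinite or NaN `rel_diff`, so such rows may need separate handling.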