More info on the repository README: https://github.com/derlin/dev.to-is-for-web-devs-and-beginners
I have been on dev.to for a while now, and looking at the featured articles, I keep wondering: is this platform only for web developers and beginners? Most of the articles that trigger comments and reactions are about javascript, react, css, or beginners, newbie, etc.
I have a few articles that I consider not that bad (not that great either), but they never seem to take off. Could it be because I am not writing about topics the community cares about?
Instead of staying in the dark, I tried to better understand what works and doesn't work on dev.to. Here are the results of my peregrinations.
dev.to is based on forem, which exposes an API: https://developers.forem.com/api/v0. Note though that some of the documentation is incorrect, or at least doesn't behave as described when tried against https://dev.to/api.
The data I wanted were about tags and articles. More precisely: the top tags, the number of articles published per tag, and the top articles, both overall and by tag.
To get the top tags, I can use the endpoint https://dev.to/api/tags (described in the forem docs as returning tags ordered by popularity), along with a per_page parameter: https://dev.to/api/tags?per_page=100&page=0. I can also fetch the https://dev.to/tags page and extract the tag-card elements.
The problem is, they do not give the same results... Using the API to fetch the first 100 tags, I got 4 tags that are now 404 Not Found: macosapps, southafricanews, sportnews, latestnigerianewslat. This led me to decide to use the https://dev.to/tags page.
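For reference, here is a minimal sketch of the two approaches. The API call and fields come from the forem docs above; the tag-card CSS class comes from the page itself, but exactly how the tag name is nested inside each card is an assumption on my part:

import requests
from bs4 import BeautifulSoup

# (a) top tags from the API, supposedly ordered by popularity
api_tags = requests.get('https://dev.to/api/tags', params={'per_page': 100, 'page': 0}).json()
api_names = [tag['name'] for tag in api_tags]

# (b) top tags scraped from the https://dev.to/tags page
soup = BeautifulSoup(requests.get('https://dev.to/tags').text, 'html.parser')
scraped_names = [card.find('a').get_text(strip=True).lstrip('#')
                 for card in soup.select('.tag-card')]

# tags returned by the API but absent from the tags page
print(set(api_names) - set(scraped_names))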
For the number of articles published with a specific tag, there are two numbers: (1) the count displayed on the https://dev.to/tags page, and (2) the count displayed on the individual tag page, https://dev.to/t/<TAG>.
The problem is, they don't match at all, with (1) often being way higher than (2). For example, at the time of writing, the archlinux tag shows "34635 posts published" on the tags page, but "151 Posts Published" on the https://dev.to/t/archlinux page... To settle this, I scrolled the tag page until no new articles were fetched, and counted 181 articles. This led me to decide to use the individual tag pages (2) for the results.
See the difference notebook for more info.
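For illustration, here is one way count (2) could be read off a tag page; the regex simply matches the "N Posts Published" text quoted above. This is a hypothetical helper, not part of the fetch script:

import re
import requests

def posts_published(tag: str) -> int:
    # read the "N Posts Published" counter shown on https://dev.to/t/<TAG>
    html = requests.get(f'https://dev.to/t/{tag}').text
    match = re.search(r'([\d,]+)\s+Posts? Published', html)
    return int(match.group(1).replace(',', '')) if match else 0

posts_published('archlinux')  # showed "151 Posts Published" at the time of writing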
To get the top articles, I had to look into the developer console to figure out how dev.to does it. Here is the query URL (cleaned a bit to remove unused query parameters):
https://dev.to/search/feed_content?class_name=Article&sort_by=public_reactions_count&sort_direction=desc
This returns the top articles of all time. To get the top articles by tag, I can add the query parameter tag_names[]=tag1, tag2, ... (note the space). The actual dev.to site also passes a tag parameter, but it doesn't seem to be used.
The search endpoint has a limit on the number of results returned: 100 at most. To get more, I need to use the page and per_page query parameters to loop through the paged results.
This is the theory. In practice, I put together a small Python script to do my bidding, see fetch/devto.py.
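For illustration, the paging logic boils down to something like the sketch below. This is not the actual fetch/devto.py: the endpoint and query parameters are the ones observed above, and I assume the JSON response wraps the articles under a result key (as seen in the browser):

import requests

SEARCH_URL = 'https://dev.to/search/feed_content'

def top_articles(tag_names=None, pages=5, per_page=100):
    # fetch the top articles, sorted by public reactions, one page at a time
    params = {
        'class_name': 'Article',
        'sort_by': 'public_reactions_count',
        'sort_direction': 'desc',
        'per_page': per_page,
    }
    if tag_names:
        params['tag_names[]'] = ', '.join(tag_names)  # note the ", " separator
    articles = []
    for page in range(pages):
        resp = requests.get(SEARCH_URL, params={**params, 'page': page})
        resp.raise_for_status()
        batch = resp.json().get('result', [])  # assumed response shape
        if not batch:
            break
        articles.extend(batch)
    return articles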
import json
import pandas as pd
import numpy as np
import plotly.express as px
import plotly.graph_objects as go
import plotly.io as pio
import itertools
pio.templates.default = 'plotly_white'
pio.renderers.default = 'notebook'
First, let's load our data into a dataframe. The top_articles_by_tag.json file has the following structure:
[
  {
    "tag": {
      "name": "...",
      "num_articles": N // (1) count from the https://dev.to/tags page
    },
    "top_articles": [
      {
        "id": 468082,
        "title": "...",
        "comments_count": 112,
        "public_reactions_count": 10893,
        "reading_time": 5,
        "tag_list": [...],
        // ... and much more
      },
      // ... more articles
    ],
    "total": N // (2) count from the https://dev.to/t/<TAG> page
  }
]
with open('../top_articles_by_tag.json') as f:
tags_data = json.load(f)
def sum_article_prop(entry, prop):
return sum(article[prop] for article in entry['top_articles'])
tags = pd.DataFrame([
[
entry['tag']['name'],
entry['total'],
sum_article_prop(entry, 'public_reactions_count'),
sum_article_prop(entry, 'comments_count')
] for entry in tags_data
],
columns=['tag', 'count', 'reactions', 'comments'])
tags = tags.sort_values('count', ascending=False).reset_index()
Let's display the total number of articles per tag (count), and the number of comments and reactions on the first 100 articles for each.
IMPORTANT: remember that one article can have up to 4 tags, so a well-liked article may boost the scores of multiple tags!
fig = go.Figure()
for column in ['count', 'reactions', 'comments']:
print(f'Total {column}: {tags[column].sum():,}')
fig.add_trace(go.Scatter(x=tags.tag, y=tags[column], mode='lines+markers', name=column))
fig.update_layout(
title='Top tags',
xaxis=dict(title='tag', tickmode='linear'),
margin=dict(l=0, r=0, t=30, b=0),
legend=dict(orientation='h', yanchor='auto', y=1.0, xanchor='auto', x=.5)
)
fig.show()
fig.write_html("plot_tags.html")
Total count: 621,345
Total reactions: 4,211,041
Total comments: 211,510
Let's plot exactly the same data, but with a normalized Y-axis. That is, the Y-axis shows the percentage of the total count/reactions/comments that each tag represents.
fig = go.Figure()
for column in ['count', 'reactions', 'comments']:
fig.add_trace(go.Scatter(x=tags.tag, y=tags[column] / tags[column].sum() * 100, mode='lines+markers', name=column))
fig.update_layout(
title='Top tags - normalized',
xaxis=dict(title='tag', tickmode='linear'),
yaxis=dict(title='%', ticklabelposition='inside'),
margin=dict(l=0, r=0, t=30, b=0),
legend=dict(orientation='h', yanchor='auto', y=1.0, xanchor='auto', x=.5)
)
fig.show()
fig.write_html("plot_tags_normalized.html")
Looking at the number of articles (count), we can see an exponential-looking decay: javascript, webdev, and beginners are really dominating the scene.
The last two "top tags" according to dev.to have only 244 and 151 articles respectively, which leads me to think that either their top-tags calculation has some bugs, or there is only a limited number of trendy tags.
def pp(condition):
return '\n ' + ', '.join(tags[condition].tag.tolist())
bold = '\033[1m'
reset = '\x1b[0m'
for high, low in itertools.pairwise([70, 30, 20, 10, 5, 2, 1, 0]):
print(f'Between {bold}{low}K-{high}K{reset} articles:')
condition = tags['count'].between(low * 1000, high * 1000)
print(' ' + ', '.join(tags[condition].tag.tolist()))
print()
Between 30K-70K articles:
  javascript, webdev, beginners

Between 20K-30K articles:
  tutorial, react, programming

Between 10K-20K articles:
  python, discuss, productivity, css, career, node, devops, codenewbie

Between 5K-10K articles:
  html, opensource, typescript, aws, showdev, github, java, testing, docker, php, security, linux, vue, ruby, git

Between 2K-5K articles:
  angular, go, database, dotnet, csharp, serverless, machinelearning, kubernetes, rails, computerscience, cloud, android, design, laravel, azure, api, algorithms, architecture, help, learning, datascience, vscode, reactnative, graphql, frontend, nextjs, flutter, watercooler, django, ios, codepen, sql, rust, todayilearned, blockchain, performance

Between 1K-2K articles:
  hacktoberfest, startup, kotlin, motivation, news, coding, challenge, mongodb, development, microservices, tailwindcss, postgres, cpp, npm, ux, gamedev, wordpress, writing, devjournal, mobile, dart, leetcode, ai, agile, firebase, management, tooling, meta, braziliandevs, mysql

Between 0K-1K articles:
  web3, community, cybersecurity, actionshackathon21, archlinux
The comments and reactions do not exactly track the number of articles. Remember, however, that we are only looking at the top 100 articles for each tag, so this may not be representative.
To see this more clearly, let's show the rank in each category. In the table below, the tags are ranked on count, reactions, and comments, and the ∂X columns are the difference between the rank in count (number of articles) and the rank in reactions (∂r) or comments (∂c). In other words, ∂r=-2 means the tag is 2 ranks lower in reactions than in count. This makes the peaks and drops from the plot above easier to see.
df = tags[['tag']].copy()
for column in ['count', 'reactions', 'comments']:
df[column] = ((df.shape[0]+1) - tags[column].rank()).astype('int32')
df['∂r'] = df['count'] - df.reactions
df['∂c'] = df['count'] - df.comments
df.style\
.hide(axis='index')\
.background_gradient(subset='∂r', cmap='seismic_r', vmin=-abs(df['∂r'].max()), vmax=abs(df['∂r'].max()))\
.background_gradient(subset='∂c', cmap='seismic_r', vmin=-abs(df['∂c'].max()), vmax=abs(df['∂c'].max()))
tag | count | reactions | comments | ∂r | ∂c |
---|---|---|---|---|---|
javascript | 1 | 2 | 2 | -1 | -1 |
webdev | 2 | 1 | 1 | 1 | 1 |
beginners | 3 | 3 | 5 | 0 | -2 |
tutorial | 4 | 9 | 13 | -5 | -9 |
react | 5 | 5 | 7 | 0 | -2 |
programming | 6 | 10 | 15 | -4 | -9 |
python | 7 | 15 | 28 | -8 | -21 |
discuss | 8 | 32 | 3 | -24 | 5 |
productivity | 9 | 6 | 8 | 3 | 1 |
css | 10 | 7 | 10 | 3 | 0 |
career | 11 | 4 | 6 | 7 | 5 |
node | 12 | 11 | 17 | 1 | -5 |
devops | 13 | 16 | 38 | -3 | -25 |
codenewbie | 14 | 13 | 18 | 1 | -4 |
html | 15 | 8 | 11 | 7 | 4 |
opensource | 16 | 17 | 9 | -1 | 7 |
typescript | 17 | 20 | 23 | -3 | -6 |
aws | 18 | 35 | 57 | -17 | -39 |
showdev | 19 | 21 | 14 | -2 | 5 |
github | 20 | 12 | 21 | 8 | -1 |
java | 21 | 31 | 39 | -10 | -18 |
testing | 22 | 46 | 66 | -24 | -44 |
docker | 23 | 22 | 46 | 1 | -23 |
php | 24 | 55 | 35 | -31 | -11 |
security | 25 | 25 | 32 | 0 | -7 |
linux | 26 | 29 | 19 | -3 | 7 |
vue | 27 | 24 | 24 | 3 | 3 |
ruby | 28 | 65 | 50 | -37 | -22 |
git | 29 | 14 | 20 | 15 | 9 |
angular | 30 | 28 | 40 | 2 | -10 |
go | 31 | 47 | 53 | -16 | -22 |
database | 32 | 52 | 68 | -20 | -36 |
dotnet | 33 | 63 | 65 | -30 | -32 |
csharp | 34 | 72 | 63 | -38 | -29 |
serverless | 35 | 58 | 64 | -23 | -29 |
machinelearning | 36 | 60 | 92 | -24 | -56 |
kubernetes | 37 | 43 | 91 | -6 | -54 |
rails | 38 | 70 | 52 | -32 | -14 |
computerscience | 39 | 18 | 30 | 21 | 9 |
cloud | 40 | 54 | 84 | -14 | -44 |
android | 41 | 76 | 70 | -35 | -29 |
design | 42 | 19 | 33 | 23 | 9 |
laravel | 43 | 79 | 51 | -36 | -8 |
azure | 44 | 84 | 78 | -40 | -34 |
api | 45 | 37 | 54 | 8 | -9 |
algorithms | 46 | 38 | 73 | 8 | -27 |
architecture | 47 | 27 | 44 | 20 | 3 |
help | 48 | 91 | 22 | -43 | 26 |
learning | 49 | 30 | 25 | 19 | 24 |
datascience | 50 | 83 | 94 | -33 | -44 |
vscode | 51 | 23 | 16 | 28 | 35 |
reactnative | 52 | 45 | 45 | 7 | 7 |
graphql | 53 | 41 | 69 | 12 | -16 |
frontend | 54 | 26 | 41 | 28 | 13 |
nextjs | 55 | 39 | 49 | 16 | 6 |
flutter | 56 | 71 | 77 | -15 | -21 |
watercooler | 57 | 48 | 12 | 9 | 45 |
django | 58 | 73 | 72 | -15 | -14 |
ios | 59 | 78 | 74 | -19 | -15 |
codepen | 60 | 50 | 42 | 10 | 18 |
sql | 61 | 56 | 81 | 5 | -20 |
rust | 62 | 68 | 62 | -6 | 0 |
todayilearned | 63 | 62 | 71 | 1 | -8 |
blockchain | 64 | 44 | 43 | 20 | 21 |
performance | 65 | 49 | 58 | 16 | 7 |
hacktoberfest | 66 | 74 | 26 | -8 | 40 |
startup | 67 | 57 | 56 | 10 | 11 |
kotlin | 68 | 93 | 87 | -25 | -19 |
motivation | 69 | 33 | 27 | 36 | 42 |
news | 70 | 61 | 31 | 9 | 39 |
coding | 71 | 36 | 37 | 35 | 34 |
challenge | 72 | 77 | 36 | -5 | 36 |
mongodb | 73 | 69 | 59 | 4 | 14 |
development | 74 | 40 | 55 | 34 | 19 |
microservices | 75 | 59 | 88 | 16 | -13 |
tailwindcss | 76 | 53 | 34 | 23 | 42 |
postgres | 77 | 81 | 80 | -4 | -3 |
cpp | 78 | 66 | 76 | 12 | 2 |
npm | 79 | 64 | 47 | 15 | 32 |
ux | 80 | 42 | 60 | 38 | 20 |
gamedev | 81 | 89 | 89 | -8 | -8 |
wordpress | 82 | 94 | 75 | -12 | 7 |
writing | 83 | 51 | 29 | 32 | 54 |
devjournal | 84 | 80 | 61 | 4 | 23 |
mobile | 85 | 86 | 93 | -1 | -8 |
dart | 86 | 88 | 90 | -2 | -4 |
leetcode | 87 | 99 | 100 | -12 | -13 |
ai | 88 | 92 | 86 | -4 | 2 |
agile | 89 | 90 | 83 | -1 | 6 |
firebase | 90 | 82 | 85 | 8 | 5 |
management | 91 | 85 | 82 | 6 | 9 |
tooling | 92 | 75 | 79 | 17 | 13 |
meta | 93 | 34 | 4 | 59 | 89 |
braziliandevs | 94 | 95 | 96 | -1 | -2 |
mysql | 95 | 97 | 95 | -2 | 0 |
web3 | 96 | 67 | 67 | 29 | 29 |
community | 97 | 87 | 47 | 10 | 50 |
cybersecurity | 98 | 96 | 99 | 2 | -1 |
actionshackathon21 | 99 | 98 | 98 | 1 | 1 |
archlinux | 100 | 100 | 97 | 0 | 3 |
From the reactions point of view:
- some types of tags clearly generate "reaction peaks":
  - react, css, html, vscode, vue, frontend, ...
  - productivity, career, computerscience, design, architecture, learning, motivation, ...
  - github, git, and docker, which are basic tools any (web) developer should know;
- on the other hand, some groups of tags show "reaction drops":
  - php, ruby, android, laravel, kotlin
  - devops, cloud, aws, azure
  - datascience and testing (← this one was unexpected).

From the comments point of view:
- discuss, watercooler, meta, writing, help, and showdev clearly generate comment peaks. I am not surprised to also see peaks for tags like career, productivity, or hacktoberfest;
- vscode has a huge comment bump, I guess because it is the editor of choice for web developers (and IDEs are sacred);
- linux and opensource generate many comments, which was unexpected;
- comment drops appear for python, devops, aws, testing, machinelearning, kubernetes, datascience, ...

As it is quite difficult to interpret the results based on the top tags, let's look instead at the top 10,000 articles of all time.
with open('../top_articles.json') as f:
articles_data = json.load(f)
articles = pd.DataFrame(
[
[
sorted(article.get('tag_list', [])),
article['id'],
article['title'],
article['published_at_int'],
article['public_reactions_count'],
article['comments_count'],
article['reading_time']
] for article in articles_data
], columns=['tags','id','title','date','reactions', 'comments', 'readtime'])
articles.describe()
 | id | date | reactions | comments | readtime |
---|---|---|---|---|---|
count | 1.000000e+04 | 1.000000e+04 | 10000.000000 | 10000.00000 | 10000.000000 |
mean | 4.515240e+05 | 1.593509e+09 | 328.960900 | 22.65850 | 5.981400 |
std | 3.367066e+05 | 3.701980e+07 | 446.039356 | 45.30416 | 6.785548 |
min | 3.560000e+02 | 1.461888e+09 | 122.000000 | -25.00000 | 0.000000 |
25% | 1.626262e+05 | 1.567225e+09 | 151.000000 | 6.00000 | 3.000000 |
50% | 3.603605e+05 | 1.592997e+09 | 204.000000 | 13.00000 | 5.000000 |
75% | 7.354550e+05 | 1.624631e+09 | 334.000000 | 25.00000 | 7.000000 |
max | 1.250952e+06 | 1.668080e+09 | 12687.000000 | 1190.00000 | 195.000000 |
As each article has a list of tags, let's "explode" the dataframe: an article with 4 tags gives rise to four rows, one per tag. We can then sum the number of articles / reactions / comments per tag:
articles_by_tag = articles.explode('tags').groupby('tags').agg(
count=pd.NamedAgg('id', 'count'),
reactions=pd.NamedAgg('reactions', 'sum'),
comments=pd.NamedAgg('comments', 'sum')
).sort_values('count', ascending=False).reset_index().rename(columns={'tags': 'tag'})
print("How the dataframe looks:")
display(articles_by_tag.head())
print("Statistics:")
articles_by_tag.describe()
How the dataframe looks:
 | tag | count | reactions | comments |
---|---|---|---|---|
0 | webdev | 4515 | 1631495 | 89391 |
1 | javascript | 4385 | 1511631 | 80178 |
2 | beginners | 3074 | 1117579 | 56631 |
3 | react | 1784 | 600528 | 30396 |
4 | tutorial | 1298 | 408933 | 19005 |
Statistics:
 | count | reactions | comments |
---|---|---|---|
count | 1676.000000 | 1.676000e+03 | 1676.000000 |
mean | 21.175418 | 7.024169e+03 | 439.421241 |
std | 191.541032 | 6.776828e+04 | 3718.690730 |
min | 1.000000 | 1.220000e+02 | 0.000000 |
25% | 1.000000 | 1.750000e+02 | 12.000000 |
50% | 1.000000 | 3.350000e+02 | 29.000000 |
75% | 4.000000 | 1.012250e+03 | 88.000000 |
max | 4515.000000 | 1.631495e+06 | 89391.000000 |
More than 50% of the 1,676 tags used have only one article, and 75% of them have 4 articles or fewer! In other words, a few tags clearly dominate.
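These percentiles come straight from the describe() output above; the share of single-article tags can also be checked directly:

single = (articles_by_tag['count'] == 1).sum()
print(f'{single} of {len(articles_by_tag)} tags ({single / len(articles_by_tag):.0%}) '
      f'appear in only one of the top 10,000 articles')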
Let's just look at the top 100 tags from the top articles of all time:
n = 100
df = articles_by_tag
fig = go.Figure()
for column in ['count', 'reactions', 'comments']:
print(f'Total {column}: {df[column].sum():,}')
fig.add_trace(go.Scatter(x=df.tag.head(n), y=df[column].head(100), mode='lines+markers', name=column))
fig.update_layout(
title='Top 100 tags appearing in the top 10,000 articles',
xaxis=dict(title='tag', tickmode='linear'),
yaxis=dict(ticklabelposition='inside'),
margin=dict(l=0, r=0, t=30, b=0),
legend=dict(orientation='h', yanchor='auto', y=1.0, xanchor='auto', x=.5)
)
fig.show()
fig.write_html("plot_articles.html")
Total count: 35,490
Total reactions: 11,772,508
Total comments: 736,470
n = 100
df = articles_by_tag
fig = go.Figure()
for column in ['count', 'reactions', 'comments']:
fig.add_trace(go.Scatter(x=df.tag.head(n), y=(df[column] / df[column].sum() * 100).head(n), mode='lines+markers', name=column))
fig.update_layout(
title='Top 100 tags appearing in the top 10,000 articles - normalized',
xaxis=dict(title='tag', tickmode='linear'),
yaxis=dict(title='%', ticklabelposition='inside'),
margin=dict(l=0, r=0, t=30, b=0),
legend=dict(orientation='h', yanchor='auto', y=1.0, xanchor='auto', x=.5)
)
fig.show()
fig.write_html("plot_articles_normalized.html")
It is interesting to see that this graph is very similar to the previous one: webdev, javascript, and beginners rule the place in all categories. The first 16 tags are all either web-oriented or very general (productivity, career, programming). The 17th, python, is the first non-web tag (but still a scripting language).
To get a better sense of how uneven this is: the first 10 tags account for more than 50% of all articles, comments, and reactions:
df = articles_by_tag[['count', 'comments', 'reactions']].set_index(articles_by_tag.tag)
(df.head(10).sum() / df.sum() * 100).to_frame(name='%')
 | % |
---|---|
count | 57.207664 |
comments | 52.693117 |
reactions | 60.887442 |
But there is a bias here: one very successful article counts once for each of its tags. To avoid that, let's count each article only once. That is, if an article has multiple tags, we keep only its most "successful" tag, where the most successful tags are the ones with the highest total number of articles, reactions, or comments, depending on the criterion at hand.
def to_series(sort_column, val_func):
tags_order = articles_by_tag.groupby('tag').agg({sort_column: 'sum'}).reset_index()\
.sort_values(sort_column, ascending=False)\
.tag
df = articles.copy()
df['top_tag'] = df.tags.apply(
lambda tags: tags_order[tags_order.isin(tags)].values[0] if len(tags) else '')
groups = df.groupby('top_tag')
return val_func(groups).rename(sort_column).sort_values(ascending=False)
s_counts = to_series('count', lambda group: group.id.count())
s_reactions = to_series('reactions', lambda group: group.reactions.sum())
s_comments = to_series('comments', lambda group: group.comments.sum())
(To make the graph more readable, I will keep only the first 80 tags for each criterion)
for sr in [s_counts, s_reactions, s_comments]:
fig = px.line(sr.head(80))
fig.update_traces(mode='lines+markers')
fig.update_layout(
title=sr.name,
yaxis=dict(title=None),
xaxis=dict(title=None, tickmode='linear'),
showlegend=False,
margin=dict(l=0, b=0)
)
fig.show()
Let's display the same information, but using a cumulative sum this time. In other words, the y-axis shows the percentage of articles (or reactions, or comments) covered by a tag together with all the tags to its left on the x-axis.
From the first graph (count), webdev covers 45% (4.5K) of the articles, webdev+javascript cover 62% (6.2K), webdev+javascript+beginners cover 72% (7.2K), etc.
for sr in [s_counts, s_reactions, s_comments]:
srtop = sr.head(80)
fig = px.line(srtop.cumsum() / sr.sum() * 100, custom_data=[srtop, srtop.cumsum().values])
fig.update_traces(dict(
mode='lines+markers',
hovertemplate='tag: %{x}<br>' + sr.name + ': %{customdata[0]:,}<br>cumsum: %{customdata[1]:,} → %{y:.2f}%'
))
fig.update_layout(
title=f'{sr.name} (cumsum) - total: {sr.sum():,}',
yaxis=dict(title=None),
xaxis=dict(title=None, tickmode='linear'),
showlegend=False,
margin=dict(l=0, b=0)
)
fig.show()
Those graphs are very interesting. They clearly show that a handful of tags (between 4 and 7, depending on the criterion) account for more than 80% of the total articles/reactions/comments. webdev by itself accounts for roughly 40-50% in each category!
def tags_accounting_for(percent, serie):
sr = (serie.cumsum() / serie.sum()) * 100
return ', '.join(f'{k} ({v:.2f}%)' for k,v in sr[sr <= percent].items())
percent = 80
print('Cumsum of the top tags accounting for 80% of the count/reactions/comments.\n')
print('Count: \n', tags_accounting_for(percent, s_counts))
print('\nReactions: \n', tags_accounting_for(percent, s_reactions))
print('\nComments: \n', tags_accounting_for(percent, s_comments))
Cumsum of the top tags accounting for 80% of the count/reactions/comments.

Count:
  webdev (45.15%), javascript (62.63%), beginners (72.87%), productivity (77.15%)

Reactions:
  webdev (49.60%), javascript (66.47%), beginners (76.68%)

Comments:
  webdev (39.45%), javascript (53.25%), welcome (62.03%), beginners (70.71%), career (75.86%), productivity (79.62%)
Here are the top ten tags for each criterion:
pd.DataFrame({
'count': s_counts.head(10).index,
'reactions': s_reactions.head(10).index,
'comments': s_comments.head(10).index
})
 | count | reactions | comments |
---|---|---|---|
0 | webdev | webdev | webdev |
1 | javascript | javascript | javascript |
2 | beginners | beginners | welcome |
3 | productivity | productivity | beginners |
4 | career | career | career |
5 | react | react | productivity |
6 | tutorial | programming | discuss |
7 | programming | css | opensource |
8 | opensource | tutorial | react |
9 | css | opensource | meta |
Finally, let's have a pretty visualization of the most prominent tags using a word cloud.
from wordcloud import WordCloud
import matplotlib.pyplot as plt
def gen_tagcloud(generate_func):
tagcloud = WordCloud(
width=1200,
height=420,
background_color="white",
collocations=False,
)
generate_func(tagcloud)
#tagcloud.to_file("tagcloud.png")
px = 1/plt.rcParams['figure.dpi'] # pixel in inches
plt.figure(figsize=(tagcloud.width*px, tagcloud.height*px))
plt.imshow(tagcloud, interpolation="bilinear")
plt.axis("off")
plt.show()
The first one is based on the top 100 tags dataframe, and uses the number of articles per tag.
freqs = dict(tags.apply(lambda ts: (ts.tag, ts['count']), axis=1).values)
gen_tagcloud(lambda tc: tc.generate_from_frequencies(freqs))
The second is based on the top 10,000 articles dataframe.
text = ' '.join(articles.tags.apply(lambda ts: ' '.join(ts)))
gen_tagcloud(lambda tc: tc.generate(text))
From the analysis above, I conclude that articles on dev.to are mostly:
- about web development (webdev, javascript, react, css, ...),
- aimed at beginners,
- about general topics such as productivity, career, and discussions.
Other languages and frameworks that web developers do not use on a daily basis are less represented, or at least trigger fewer reactions and comments (kotlin, php, devops, datascience, machinelearning, ...).
The data I have show that more than 50% of all interactions on dev.to happen on the tags webdev, javascript, and beginners. In other words, if you are not a web developer and like to write about technical things, it will be hard to shine on dev.to. This doesn't mean you shouldn't contribute, though!