  • Don’t forget the weekdays. The total births in the dataset by weekday are:

    SUN |||||||||||||||||||||||| 5886889
    MON ||||||||||||||||||||||||||||||||||||| 9316001
    TUE ||||||||||||||||||||||||||||||||||||||||| 10274874
    WED |||||||||||||||||||||||||||||||||||||||| 10109130
    THU |||||||||||||||||||||||||||||||||||||||| 10045436
    FRI ||||||||||||||||||||||||||||||||||||||| 9850199
    SAT |||||||||||||||||||||||||| 6704495
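
A quick way to sanity-check these totals is to add them up and look at each weekday's share. This is only a sketch: the counts are copied from the list above, and the names `births_by_weekday` and `weekend_to_weekday_ratio` are made up for illustration.

```python
# Weekday totals copied from the list above.
births_by_weekday = {
    "SUN": 5_886_889,
    "MON": 9_316_001,
    "TUE": 10_274_874,
    "WED": 10_109_130,
    "THU": 10_045_436,
    "FRI": 9_850_199,
    "SAT": 6_704_495,
}

total = sum(births_by_weekday.values())

# Share of all births that fall on each weekday, in percent.
shares = {day: 100 * n / total for day, n in births_by_weekday.items()}

# Average births on a weekend day relative to an average Monday-Friday day.
weekend_avg = (births_by_weekday["SUN"] + births_by_weekday["SAT"]) / 2
workday_avg = sum(births_by_weekday[d] for d in ("MON", "TUE", "WED", "THU", "FRI")) / 5
weekend_to_weekday_ratio = weekend_avg / workday_avg

print(f"total births: {total:,}")
for day, pct in shares.items():
    print(f"{day} {pct:5.2f}%")
print(f"weekend/weekday ratio: {weekend_to_weekday_ratio:.2f}")
```

The sum comes out to 62,187,024, so the weekday totals are consistent with the overall number of births in the dataset.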


  • The color scale is terrible. Here is a more credible chart, presumably based on the same Social Security Administration data, covering 62,187,024 US births (2000-2014).

    Meanwhile, the actual Reddit OOP of the post’s chart is u/plotset, an account made to shill PlotSet.com, a data visualization tool.
    They had this to say about the data:

    This data represents 4,153,303 US-born babies only between 2000 and 2014.

    Top 10 Most Common: Sep 12 (0.307%), Sep 19 (0.306%), Sep 20 (0.302%), Dec 19 (0.300%), Sep 10 (0.300%), Dec 20 (0.299%), Sep 18 (0.299%), Aug 8 (0.299%), Sep 26 (0.299%), Sep 17 (0.298%)

    Top 10 Least Common: Dec 25 (0.155%), Jan 1 (0.186%), Dec 24 (0.193%), Jul 4 (0.212%), Jan 2 (0.231%), Dec 26 (0.238%), Nov 23 (0.238%), Nov 25 (0.240%), Nov 27 (0.241%), Nov 24 (0.241%)

    Data Source: Kaggle.com/datasets/ayessa/birthday

    Tools: PlotSet.com

    Note that the “4,153,303” figure is bullshit. It is close to the number of births per year, but it matches neither the total for any of the 15 years nor their average.

    Also, neither chart normalizes by weekday: 3 of the years in question started on a Tuesday and 3 on a Saturday, while only 1 started on a Friday, which causes most of the variation that got amplified by OOP’s terrible color range. (Because of leap years, I made a table of the most common starting weekdays for each month; see my other comment. For example, one of the most common birthdays, August 15, fell more often on a Wednesday or Friday than on a Saturday.) Without doing weird math, one can largely mitigate the effect of weekdays by using data from 28 consecutive years, which I believe can be pieced together from several good online sources, but I’ll be leaving that as an exercise to the reader.
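
The calendar claims in this comment can be checked with a few lines of standard-library Python. This is a rough sketch, not the method used for the table mentioned above: `weekday_counts` is a made-up helper, the 62,187,024 total is the figure quoted earlier, and the 1987-2014 window is just one arbitrary example of 28 consecutive years. The per-year birth totals are not available here, so only the average is compared against the “4,153,303” figure.

```python
from datetime import date

DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]  # date.weekday(): Monday is 0


def weekday_counts(month, day, first_year, last_year):
    """Count how often month/day falls on each weekday, years inclusive."""
    counts = {name: 0 for name in DAYS}
    for year in range(first_year, last_year + 1):
        counts[DAYS[date(year, month, day).weekday()]] += 1
    return counts


# Average births per year vs. the figure quoted by OOP.
total_births = 62_187_024          # total quoted earlier in this comment
average_per_year = total_births / 15
claimed = 4_153_303
print(f"average per year: {average_per_year:,.1f} (claimed figure: {claimed:,})")

# Which weekdays did the years 2000-2014 start on?
jan1 = weekday_counts(1, 1, 2000, 2014)
print("Jan 1, 2000-2014:", jan1)

# Which weekdays did August 15 fall on in the same range?
aug15 = weekday_counts(8, 15, 2000, 2014)
print("Aug 15, 2000-2014:", aug15)

# Over 28 consecutive years (assumed window: 1987-2014) every weekday
# comes up equally often for a fixed date.
aug15_28 = weekday_counts(8, 15, 1987, 2014)
print("Aug 15, 1987-2014:", aug15_28)
```

The balance over 28 years relies on every fourth year in the window being a leap year, so it holds for windows inside 1901-2099 but not for ones that cross 1900 or 2100.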