Authors' Guide

Barry Pollard edited this page Dec 3, 2024 · 54 revisions

Web Almanac authors are subject matter experts from the web community who write about the state of the web within the scope of the chapter.

Commitment summary

If you're considering becoming an author of a Web Almanac chapter, you're probably wondering what the level of commitment is. Authors' responsibilities are all different depending on the complexity of the topic, number of coauthors, and how much you have to say. Generally though, you should expect to spend your time in the following way:

  • Content planning: about 12 hours working with coauthors and peer reviewers to brainstorm the scope of the chapter and what stats/metrics you'd need from the HTTP Archive dataset. The planning phase may vary year to year but will typically occur during May-June.
  • Data validation: about 8 hours working with data analysts to validate that the results pulled from the dataset align with your expectations. This phase is meant to catch analysis bugs early so we can rerun queries and ensure that your chapter is methodologically valid. This phase typically occurs in July-September.
  • Content writing: about 20 hours working with coauthors, peer reviewers, data analysts, and editors to write and revise the contents of the chapter. This phase typically occurs in October-November.

The total commitment is approximately 40 hours of work over 6 months.

How to join

The 2021 project is underway and we're actively looking for authors! Browse the list of open chapters and comment in any chapter-specific issues that interest you.

Principles

Use "percent of websites/requests" rather than "percent of traffic" when measuring adoption. The HTTP Archive dataset treats each site equally, with no concept of site popularity, so we should use metrics relative to the sample size. For 2021 we have access to the site rank from CrUX but that still doesn't indicate anything around traffic.
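As a rough illustration of "percent of websites", the sketch below counts each site in a sample once, regardless of how popular it is. The `sample` records and the `uses_http2` field are hypothetical and are not the HTTP Archive schema.

```python
def percent_of_sites(sites, predicate):
    """Share of sites in the sample for which predicate(site) is true."""
    if not sites:
        raise ValueError("sample is empty")
    matching = sum(1 for site in sites if predicate(site))
    return 100 * matching / len(sites)


# Hypothetical sample: every site counts equally, whatever its traffic.
sample = [
    {"url": "https://a.example", "uses_http2": True},
    {"url": "https://b.example", "uses_http2": True},
    {"url": "https://c.example", "uses_http2": False},
    {"url": "https://d.example", "uses_http2": True},
]

adoption = percent_of_sites(sample, lambda site: site["uses_http2"])
print(f"{adoption:.1f}% of websites")
```

Because the denominator is the number of sites in the sample, the result says nothing about how many users or page views are affected.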

Prefer HTTP Archive stats but don't feel limited. If your chapter requires a particular metric that the HTTP Archive dataset is unable to quantify, let's make a note of it and try to find another data source that includes it. For example, lab data is notoriously unrepresentative of real user perceived performance. So in the Performance chapter of the UX section, it makes more sense to use the Chrome UX Report dataset. When possible, do your own research on public datasets.

The glass is half-full. Sometimes we're drawn to shine a light on the fires burning in the tail of the web, but let's not forget about all the good stuff done by folks who are making active efforts to improve their UX. For example, we should highlight the positive adoption rate of HTTPS and not just dwell on the 20% of websites without it.

Look for country-level insights. The HTTP Archive dataset itself does not distinguish by country, but it's possible to join with the Chrome UX Report's country-level datasets. Ask yourself if there are any trends in emerging markets that would be worth investigating. Would country-specific conditions like slow connectivity be an interesting angle to explore? Or maybe there are country-specific factors that affect things like web font usage.

Writing the chapter

You can think of your chapter as a research-based blog post about the state of "X" in this year, where X is the topic of your chapter. You are the subject matter expert providing an interpretation of the HTTP Archive results so readers can make sense of the data and learn about how X is used in the wild.

Like any blog post, the number of words or pages depends mostly on how much you have to say. If you have a lot to talk about, feel free to make it longer. If your chapter only has a few metrics and the analysis is cut and dried, it's ok for it to be shorter than average. We're not writing a book (despite the terminology) so don't feel like you need to be verbose or longwinded for the sake of filling the page.

Narrating the results

What makes the Web Almanac different from the data we're already showing on httparchive.org is that this project has the ability to put the data in a personal context. By framing the results in experts' interpretations and experience, readers are better able to understand what the results mean for the state of the web and why it matters. For example, it's one thing to read a stat that the 90th percentile of JS bytes served to websites is over 1 MB. It's so much more insightful to have JS experts explaining what it means for 1+ MB of JS to be served to users: how that might affect the UX, how it might hurt mobile users' data plans, etc. And anecdotally, what are developers doing to allow JS payloads to get so big? Having authors share their experience and give an interpretation of the results allows for a richer experience beyond the data.

Show and tell: visualizing data

In addition to explaining what the results mean, leave placeholders for visualizing the results so readers can see them for themselves. Make a note of what kind of data viz would help tell the story and designers/developers/analysts will help build it. For example: <insert a bar chart of metric 10.01 here>. Or if you know how you want it to look, you can drop in a placeholder image. The final versions of the data visualizations will be designed from a common theme so all Almanac charts have a unified and consistent style.

Explaining the research methodology

We will have a site-wide Methodology page that discusses our analytical approach, including the sample population and tools used. If your chapter has any methodological edge cases or exceptions, for example having to query the September 2020 dataset rather than the standard August dataset, that would be important to mention. But don't feel that you need to reintroduce every tool or process used if readers will have already seen it in other chapters or if it's covered in the Methodology.

General principles for technical writing

The Google Developer Documentation Style Guide is an in-depth resource for technical writers. Familiarize yourselves with the "General principles" section, specifically:

If all chapter authors and reviewers follow these guidelines, the Almanac will have a consistent style and unified voice.

However, those style guides are usually used for writing tutorials and how-to guides, so not all the suggestions may be relevant for us. For example they advise:

Use second person: "you" rather than "we."

Which makes perfect sense when explaining how to do something ("If you're deleting multiple entries at a time ..."), but less so for us, where we are speaking based on our collective research ("we have analyzed...").

It may also be relevant to use "I" or "this author" to indicate a personal (particularly controversial) opinion that is not backed by the data from the Web Almanac ("...despite this often being the recommended approach, this author believes something completely different due to the fact that...").

Submitting your chapter

Once you've written your chapter you should submit it in a pull request. Each chapter is written in Markdown (GitHub flavor).

You can download it from Google Doc as Markdown: File -> Download -> Markdown.

Metadata to add at the top of your chapters.

The first few lines are a set of metadata about the chapter, in metadata: value format. The example for the 2019 HTTP/2 chapter is shown below:

---
#See https://github.com/HTTPArchive/almanac.httparchive.org/wiki/Authors'-Guide#metadata-to-add-at-the-top-of-your-chapters
title: HTTP/2
description: HTTP/2 chapter of the 2019 Web Almanac covering adoption and impact of HTTP/2, HTTP/2 Push, HTTP/2 Issues, and HTTP/3.
authors: [tunetheweb]
reviewers: [bagder, rmarx, dotjs]
analysts: [paulcalvano]
editors: [rviscomi]
translators: []
discuss: 1775
results: https://docs.google.com/spreadsheets/d/1z1gdS3YVpe8J9K3g2UdrtdSPhRywVQRBz5kgBeqCnbw/
tunetheweb_bio: Barry Pollard is a software developer and author of the Manning book <a href="https://www.manning.com/books/http2-in-action">HTTP/2 in Action</a>. He thinks the web is amazing but wants to make it even better. You can find him tweeting <a href="https://twitter.com/tunetheweb">@tunetheweb</a> and blogging at <a href="https://www.tunetheweb.com">www.tunetheweb.com</a>.
featured_quote: HTTP/2 was the first major update to the main transport protocol of the web in nearly 20 years. It arrived with a wealth of expectations&colon; it promised a free performance boost with no downsides. More than that, we could stop doing all the hacks and work arounds that HTTP/1.1 forced us into, due to its inefficiencies. Bundling, spriting, inlining, and even sharding domains would all become anti-patterns in an HTTP/2 world, as improved performance would be provided by default. This chapter examines how this relatively new technology has fared in the real world.
featured_stat_1: 95%
featured_stat_label_1: Users who can use HTTP/2
featured_stat_2: 27.83%
featured_stat_label_2: Requests with HTTP/2 prioritisation issues
featured_stat_3: 8.38%
featured_stat_label_3: Sites supporting QUIC
---

## Introduction
HTTP/2 was the first major update...

This metadata is used to generate the chapter and also to display the featured chapter section for this chapter on the Web Almanac Home Page. Please include a short bio for each author as shown. Reviewers, analysts and translators don't get bios (sorry!).

Metadata field descriptions

  • title: The title of the chapter. This will be used in the H1 of the page, document title, and social metadata. Also listed in the chapter issue.
  • description: A short description of the chapter for SEO and social sharing. Try to keep it to 50-160 characters.
  • authors: Contributor IDs for each author. Authors should be listed in order of their contributions, starting with the lead author. Authors who write/contribute more to the chapter should be listed before those who contribute less as a courtesy. If authors other than the lead roughly contribute equally, list authors alphabetically by last name. Contributor IDs are the property names in the contributors field of the annual config file, for example rviscomi in 2020.json.
  • reviewers: Similar to the authors field, list reviewers in order of contributions or alphabetically.
  • analysts: Similarly, list analysts in order of their contributions or alphabetically.
  • editors: Leave this blank initially. All chapters will be extensively edited for typos and consistency across chapters, and here's where we give the editor credit.
  • translators: You get the idea.
  • discuss: NOT USED FROM 2021 ONWARDS. This was the discussion topic ID for the chapter. Discussion topics are created by the project leads and live in the HTTP Archive forum. Feel free to leave this blank if you don't know your topic ID. For example, the 2020 CSS chapter URL is https://discuss.httparchive.org/t/chapter-1-css/2037, which has a topic ID of 2037.
  • results: This is the public spreadsheet containing all of the results for your chapter's analysis. This is listed in the chapter issue.
  • <author_id>_bio: Bio for each author in the authors field. Your bio is useful to help readers understand why you're qualified to write this chapter, so focus on your professional experiences but feel free to personalize it. Feel free to add links (in either HTML or Markdown) but be aware that all of your social URLs are also displayed here based on your contributor file metadata, so no need to repeat it. This is shown at the bottom of the chapter.
  • featured_quote: Pick a line from the chapter (lightly edited is ok) that really captures its essence. This is used on the home page to feature the chapter and draw readers in, so make it hooky and interesting. Try to limit it to 200 characters so it will fit in a tweet with a link. We may explore options to actually make it easier to tweet it from within the chapter.
  • featured_stat_[1,2,3]: A numeric value that will be shown in large font on the home page to draw readers in. For example if the stat is "99% of pages include the color blue" this value should only be 99%. Up to 3 featured stats supported.
  • featured_stat_label_[1,2,3]: The corresponding description to be shown with the stat value above. In the example given, the value label should be Pages that include the color blue.
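If you'd like to sanity-check your metadata before opening a pull request, a minimal sketch like the one below may help. It is not part of the Web Almanac tooling: the function names are hypothetical, the parser only understands simple `key: value` lines (it is not a full YAML parser), and the set of fields treated as required is an assumption based on the descriptions above.

```python
def parse_front_matter(text):
    """Parse simple `key: value` lines between two `---` lines into a dict."""
    lines = text.strip().splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("chapter must start with a '---' line")
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields
        if not line.strip() or line.lstrip().startswith("#"):
            continue  # skip blank lines and comment lines
        # Split on the first colon only, so URLs in values stay intact.
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    raise ValueError("front matter is missing its closing '---' line")


def check_front_matter(fields):
    """Return a list of warnings based on the field descriptions in this guide."""
    warnings = []
    # Assumption: these are the fields an author is expected to fill in.
    for required in ("title", "description", "authors", "results", "featured_quote"):
        if not fields.get(required):
            warnings.append(f"missing field: {required}")

    description = fields.get("description", "")
    if description and not 50 <= len(description) <= 160:
        warnings.append(f"description is {len(description)} characters (aim for 50-160)")

    quote = fields.get("featured_quote", "")
    if len(quote) > 200:
        warnings.append(f"featured_quote is {len(quote)} characters (aim for 200 or fewer)")

    # authors is written as a bracketed list, for example [tunetheweb].
    authors = [a.strip() for a in fields.get("authors", "").strip("[]").split(",") if a.strip()]
    for author in authors:
        if not fields.get(f"{author}_bio"):
            warnings.append(f"missing bio: {author}_bio")

    for n in (1, 2, 3):
        has_stat = bool(fields.get(f"featured_stat_{n}"))
        has_label = bool(fields.get(f"featured_stat_label_{n}"))
        if has_stat != has_label:
            warnings.append(f"featured_stat_{n} and featured_stat_label_{n} should be set together")
    return warnings
```

Calling `check_front_matter(parse_front_matter(chapter_text))` returns an empty list when nothing looks off. The warnings are only hints; the editors have the final say.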

Formatting information for your chapter

We edit the chapter in markdown and then will need to convert it to HTML.

Below are some tips and tricks for the markdown:

Headings

After the metadata your chapter should start with the Introduction heading as a second level heading (## Introduction). There is no need to include the Chapter Title, Hero Image nor Table of Contents - these will automatically be generated.
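A quick way to check this rule is sketched below. The function name is hypothetical and the check assumes the metadata block is delimited by `---` lines as in the example above.

```python
def starts_with_introduction(chapter_text):
    """True if the first content line after the metadata is '## Introduction'."""
    lines = chapter_text.splitlines()
    if lines and lines[0].strip() == "---":
        # Find the closing '---' of the metadata block and skip past it.
        closing = next(
            (i for i, line in enumerate(lines[1:], start=1) if line.strip() == "---"),
            None,
        )
        if closing is None:
            return False  # metadata block never closes
        lines = lines[closing + 1:]
    for line in lines:
        if line.strip():
            return line.strip() == "## Introduction"
    return False  # no content at all
```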

Paragraphs

Please include paragraphs on single lines and do not format to 80-character widths. Fixed-width formatting makes tracking edits more difficult and leads to different line numbers in translations. The Web Almanac can deal with very long lines 😀

Good example

## Introduction

This is a sentence. So is this. And this.

This is a new paragraph.

Bad example

## Introduction

This is a sentence.
So is this.
And this.

This is a new paragraph that
has been formatted into fixed
widths.
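If your draft is already hard-wrapped, a small script can join the lines back together. The sketch below is an assumption-laden helper, not project tooling: it only handles plain prose and `#` headings, so lists, tables, code blocks and the metadata block would need to be excluded or handled by hand.

```python
def unwrap_paragraphs(markdown):
    """Join hard-wrapped lines so each paragraph sits on a single line."""
    blocks = []
    current = []

    def flush():
        # Emit the paragraph collected so far, if any.
        if current:
            blocks.append(" ".join(current))
            current.clear()

    for line in markdown.splitlines():
        stripped = line.strip()
        if not stripped:
            flush()  # a blank line ends the paragraph
        elif stripped.startswith("#"):
            flush()
            blocks.append(stripped)  # headings stay on their own line
        else:
            current.append(stripped)
    flush()
    return "\n\n".join(blocks) + "\n"
```

Running it on the bad example above produces the single-line form shown in the good example.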

Quotes

Please use plain quote marks: ', ". Typographic or "curly" quotes (“like these”) are automatically applied by our build process. If you've already used them, then please do a search and replace to replace them with plain quote marks. They're easy to manage in code editors and GitHub. For the type fanatics, we promise they will be converted! 😀
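If a search and replace in your editor is awkward, the replacement can be scripted. This is a minimal sketch with a hypothetical function name; it only covers the four common curly quote characters.

```python
# Map the common typographic quotes to their plain equivalents.
QUOTE_MAP = str.maketrans({
    "\u2018": "'",   # left single quote
    "\u2019": "'",   # right single quote (also used as an apostrophe)
    "\u201c": '"',   # left double quote
    "\u201d": '"',   # right double quote
})


def straighten_quotes(text):
    """Replace curly quotes with plain quote marks."""
    return text.translate(QUOTE_MAP)
```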

Asides or notes

Use <p class="note">...</p> for notes and asides. These are currently written in italics.

Links

Markdown has a special link syntax ([link text](https://www.example.com)), but you can also use HTML (<a href="https://www.example.com">link text</a>). Confusingly we use both! But there is a good reason, I promise.

We have translators who translate the chapters. To help readers of translated chapters, we try to highlight external links that are in English with an (en) symbol.

Translated text with an external link highlighted with an (en) to show the linked resource is in English

To do this external links should be written in HTML with a hreflang="en" attribute.

<a hreflang="en" href="https://www.example.com/page">archivos individuales y más pequeños</a>

Unfortunately Markdown link syntax does not allow you to add attributes, which is why we need to revert to HTML for those external links.

Note that there are some exceptions:

  • Some sites (e.g. MDN, Wikipedia, web.dev, developer.chrome.com, developers.google.com) have translations available in many languages (though admittedly the latter use machine translations, with limited human input). Therefore we omit the hreflang="en" attribute for these sites and can use standard Markdown links.
  • Internal links to other parts of the Web Almanac do not need the hreflang="en" attribute, as we aim to have translations. As this is a volunteer effort not all chapters are translated to every language, but more are added all the time, so we live in (perhaps overly optimistic!) hope that they will be eventually!

We can help auto-convert Markdown links to HTML links with a bit of magic regex, so don't worry about this one too much. But we thought it was worth explaining in case you were confused looking at any existing chapters.
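To give a feel for what such a conversion might look like, here is a rough sketch. It is not the regex the project actually uses. The host list is an assumption drawn from the exceptions above (with developer.mozilla.org standing in for MDN), almanac.httparchive.org is assumed to be the only internal host, and the pattern does not handle URLs that contain parentheses or link titles.

```python
import re
from urllib.parse import urlparse

# Assumed hosts that offer translations, based on the exceptions listed above.
MULTILINGUAL_HOSTS = {
    "developer.mozilla.org",
    "web.dev",
    "developer.chrome.com",
    "developers.google.com",
}
INTERNAL_HOST = "almanac.httparchive.org"  # assumed host of the Web Almanac site

# [text](url), but not ![alt](url), which is an image.
MARKDOWN_LINK = re.compile(r"(?<!!)\[([^\]]+)\]\(([^)\s]+)\)")


def needs_hreflang(url):
    """True for external links that are assumed to be English-only."""
    host = urlparse(url).hostname
    if host is None or host == INTERNAL_HOST:
        return False  # relative or internal link
    if host in MULTILINGUAL_HOSTS or host.endswith(".wikipedia.org"):
        return False
    return True


def convert_links(markdown):
    """Rewrite qualifying Markdown links as HTML links with hreflang="en"."""
    def replace(match):
        text, url = match.group(1), match.group(2)
        if not needs_hreflang(url):
            return match.group(0)  # leave the Markdown link untouched
        return f'<a hreflang="en" href="{url}">{text}</a>'

    return MARKDOWN_LINK.sub(replace, markdown)
```

For example, `convert_links("[link text](https://www.example.com/page)")` yields an `<a hreflang="en" ...>` link, while links to the exempt sites and relative links are left in Markdown form.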

Images, Figures and Tables

Please see our Figures Guide for how to mark up figures and images.

Images should be included in the repo in the src/static/images/{year}/{chapter} folder and each image should be given a meaningful name (e.g. durable-storage-estimate-usage.png rather than 13_01.png).
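The naming rule can be sketched as a small check. The helper names, the `pwa` chapter folder in the usage note, and the pattern used to spot number-only names are all assumptions for illustration.

```python
import re
from pathlib import PurePosixPath

# Names made only of numbers (optionally prefixed with "fig"), such as 13_01.
GENERIC_NAME = re.compile(r"^(fig(ure)?[-_]?)?\d+([-_]\d+)*$", re.IGNORECASE)


def image_path(year, chapter, filename):
    """Build the repo path for a chapter image."""
    return str(PurePosixPath("src/static/images") / str(year) / chapter / filename)


def has_meaningful_name(filename):
    """False for number-only names like 13_01.png, True otherwise."""
    stem = PurePosixPath(filename).stem
    return not GENERIC_NAME.match(stem)
```

For instance, `image_path(2020, "pwa", "durable-storage-estimate-usage.png")` gives `src/static/images/2020/pwa/durable-storage-estimate-usage.png`, and `has_meaningful_name("13_01.png")` is `False`.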