Is Drake doing alright?

With all the crazy shit that’s going on in the world, it’s no surprise that I often wake up in the middle of the night worried. Like any normal person, I’ll sometimes rise at 3am, sit up in bed, feel momentarily burdened by the silence of the slumbering city, and then whisper a question to an unseen force in the darkness: “…Do you think Drake is okay?” I mean, when was the last time he called?

Ever since I left the city… And now with a new album/mixtape/playlist out, well, we might be able to get another little glimpse into his sad, sad, overstuffed box of feelings that he carries around with him like an empty gold cup. With a bit of R, Lexicoder, and some web scraping, maybe we can start wrestling with some important questions.

Questions like: Seriously Drake, how you doin’? Are you okay?

I mean, it’s completely reasonable to think that maybe the weight of the world is getting Drake down – and if so, we should probably be able to see that in his lyrics. Are they getting sadder?  Are his albums getting sadder over time, or what? WHAT’S GOING ON DRIZZY?

Drake’s Saddest Songs

One thing is for sure: if Drake’s got enemies – I mean got a lot of enemies – they’re not draining him of any energy. I collected 275 songs; 191 of them were on albums and were included in the analysis here. So Drake is working hard, putting out a lot of music each year. But which one of Drake’s songs is the saddest? Well, using the Lexicoder Sentiment Dictionary, it would appear that the most negative song is “Digital Dash” from What a Time To Be Alive.

With a sentiment score of -46.0, it is a clear outlier (the average sentiment score for all Drake songs is 4.9, and the average score for this album is -2.2). It’s not clear if this is the saddest song in Drake’s album catalogue (the dictionary captures negativity, not sadness), but it’s definitely one of the darkest.

A quick note on methods: all we are doing here is running a dictionary of positive and negative sentiment words against a cleaned-up dataset of Drake’s lyrics. The overall sentiment score is just the positive word count minus the negative word count, so a score below zero means more negative words than positive ones. We can then plot those scores over time, by album, or look at other measures if we’d like.
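
As a rough illustration of that scoring, here is a minimal Python sketch. The analysis itself was done in R with the Lexicoder Sentiment Dictionary; the tiny word lists, the tokenizer, and the function name below are made-up stand-ins, so treat this as a sketch of the idea rather than the actual pipeline.

```python
import re

# Hypothetical stand-in word lists; the real Lexicoder Sentiment Dictionary
# is far larger than this.
POSITIVE = {"love", "good", "best", "happy"}
NEGATIVE = {"hate", "bad", "sad", "enemies"}


def sentiment_score(lyrics: str) -> int:
    """Return the positive word count minus the negative word count."""
    # Simplifying assumption: lowercase the text and split on anything
    # that is not a letter or an apostrophe.
    words = re.findall(r"[a-z']+", lyrics.lower())
    positive = sum(word in POSITIVE for word in words)
    negative = sum(word in NEGATIVE for word in words)
    return positive - negative


print(sentiment_score("I got enemies, got a lot of enemies"))
print(sentiment_score("The best, the best, I love it"))
```

A song with more negative hits than positive ones ends up below zero, which is how a track can land at something like -46.0.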

Fortunately, it looks like Drake’s been getting happier as his career has gone on (sentiment trend overlaid below). I would love to plot these scores against his album sales … so maybe that’s a next step.

As Drake has a tendency to do things real big, bigger than you ever done it, it’s no surprise that each album has a few outliers – both negative and positive. When we drop the outliers, it looks like the saddest/darkest songs in Drake’s catalog are “Own It” off “Nothing Was The Same”, and “Say What’s Real” off of “So Far Gone”, both with a sentiment score of -19.0.

Drake’s Saddest Albums

This doesn’t quite tell us the full story though. When we look at overall sentiment per album, we can see that Drake … well… it seems like Drizzy wasn’t in a very good place for a few years….


But by the time Views was released, he seemed to have gotten things under control. Actually, let me rephrase, he was on top of the fucking world: Views was extremely positive. Like off the charts positive. Perhaps Drake was too strung out on compliments, and he overdosed on confidence, because More Life, the follow up, is much more subdued – back to the sentiment levels of Take Care.

So what does this mean? Are we about to enter another Drake slump? Will his next album be another dark pit of depression, introspection, and sadness, or is this just a momentary blip?

If we look at the number of positive and negative songs per album, it looks like the overwhelming majority of songs on the last album are still positive (though these haven’t been weighted yet, so this could just be lots of short interludes with positive counts). So even if the overall tone of the album is more negative, the negativity may just be concentrated in a few really negative songs.


Album                                          Negative Songs   Positive Songs   Diff (Positive - Negative)
2006 – Room for Improvement                    3                11               8
2007 – Comeback Season                         5                15               10
2009 – So Far Gone                             4                12               8
2010 – Thank Me Later                          0                11               11
2010 – Young Sweet Jones                       1                10               9
2011 – Take Care                               10               10               0
2013 – Nothing was the Same                    6                8                2
2015 – If You’re Reading This It’s Too Late    9                9                0
2015 – What A Time To Be Alive                 4                5                1
2016 – Views                                   0                12               12
2017 – More Life                               6                14               8
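
For anyone who wants to play with these counts, here is a small Python sketch that recomputes the Diff column and adds a share of positive songs. The counts are copied from the table; the names `ALBUMS` and `album_summary` and the "share positive" measure are illustrative additions of mine, and the share only covers songs that were counted as positive or negative.

```python
# (negative songs, positive songs) per album, copied from the table above.
ALBUMS = {
    "2006 - Room for Improvement": (3, 11),
    "2007 - Comeback Season": (5, 15),
    "2009 - So Far Gone": (4, 12),
    "2010 - Thank Me Later": (0, 11),
    "2010 - Young Sweet Jones": (1, 10),
    "2011 - Take Care": (10, 10),
    "2013 - Nothing was the Same": (6, 8),
    "2015 - If You're Reading This It's Too Late": (9, 9),
    "2015 - What A Time To Be Alive": (4, 5),
    "2016 - Views": (0, 12),
    "2017 - More Life": (6, 14),
}


def album_summary(negative: int, positive: int) -> dict:
    """Return the Diff column plus the share of counted songs that are positive."""
    total = negative + positive
    return {
        "diff": positive - negative,
        "share_positive": positive / total,
    }


for album, (negative, positive) in ALBUMS.items():
    summary = album_summary(negative, positive)
    print(f"{album}: diff={summary['diff']}, "
          f"positive share={summary['share_positive']:.0%}")
```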


Of course, this is a first cut. If this had serious implications, I would spend a lot more time cleaning the data and maybe working on a Drake-specific sentiment dictionary. You’ll also note that I haven’t actually gone through the lyrics to see if this can be confirmed, but I suspect that it could be. It’d be great to have someone who really knows his catalog help sort through this. But from my knowledge of Drake’s catalog, this seems pretty accurate.

Tentative conclusion: Drake’s probably doing just fine, but this could be the beginning of another exploration of the darker sides of his feelings. Not particularly surprising for someone who started at the bottom, but really, I think I like who he’s becoming.


On a whim, I just checked to see what the most common words were in the dataset. What was the most common word that Drake says in his music?

That word is “I”
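
If you want to run the same check on your own lyrics dataset, a word count like this takes only a few lines. This is a minimal Python sketch, not the code behind the post: the sample lines are invented placeholders and `most_common_words` is a hypothetical helper name.

```python
import re
from collections import Counter


def most_common_words(lines, n=3):
    """Return the n most frequent lowercase words across all lines."""
    # Simplifying assumption: words are runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", " ".join(lines).lower())
    return Counter(words).most_common(n)


# Invented placeholder lines, not actual lyrics.
sample = ["I think I am fine", "I said I would call"]
print(most_common_words(sample, n=1))
```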

I’m on twitter @markdaku

The Republican party’s position on health insurance is fundamentally flawed


Today on NPR there was an interview with Republican Representative Jeff Fortenberry of Nebraska about a town hall he just held, and about how difficult – but important – that town hall was. He told a story about two constituent letters: one from a farmer named Scott who had trouble paying his increasing premiums because of the current price of corn, the other from “Catherine,” a woman with a pre-existing condition who could finally afford coverage under the ACA.

Fortenberry told this story in order to argue his point – that it’s not about either of these constituents being correct, but rather that they are both right. Because both constituents have valid claims, they should both be addressed.

Fortenberry sounded eminently reasonable. He was not out to hurt people, he honestly wanted to help all his constituents, and he honestly believed that some variant of the Republican plan was the best way to do it. I wanted to engage with his argument, starting from the assumption that he was being genuine in his beliefs and objectives. So the question is: does a solution that emphasizes freedom, choice, and personal responsibility make sense?

Fortenberry’s position implicitly assumes that it is within the control of each individual to alter their health-coverage position. 

The reality is that neither constituent is in complete control of their ability to pay for health insurance.  “Scott” is at the whim of the corn market, while “Catherine” is at the whim of the insurance companies (from here on I am creating fictional characters, I know nothing about these people).  The assumption is that by removing government restrictions, both will be better off because there will be more (and cheaper) options available on the market and they will be able to pursue the health coverage that is most in line with their preferences and needs. However, without any regulation there is no guarantee that this coverage will actually exist in a meaningful way: the easiest way for insurance companies to reduce their costs (and thus improve profitability) is to exclude those at high risk of drawing on insurance – those with pre-existing conditions.  Individuals like Catherine will be left out or priced out because it is not in the financial interest of insurance companies to cover them.

But let us assume that the coverage does exist, and that it doesn’t exclude people with pre-existing conditions, and it is currently at a price of $100.  Let us also assume that Scott and Catherine can both afford to pay the same amount for health care, say $90 a month.

For Scott, the assumption is that for him to afford health insurance he merely has to bring the amount he can afford up $10.  For a healthy, employed, individual we can say that the amount of money that he can afford to spend on healthcare is some function of net income. And net income is some function of the individual’s capital, labor, and spending.  It makes sense then, that if Scott wants to afford health care, Scott can do specific things to accomplish this. He can reduce spending on other goods by $10, he can work more and earn the additional $10, or he can dip into his savings to cover the added expense.  If this isn’t enough, we can assume that Scott – given his desire to have a basket of certain goods that include health insurance – will pursue employment elsewhere or plant a different crop that will net him that extra $10.  In short, we can consider Scott to be an agent who has (at least some) control over his income.

Let’s assume that the same thing applies to Catherine, and let’s for the moment assume that the insurance market has not discriminated against her because of a pre-existing condition.   Her ability to afford health insurance is the same function: some combination of capital, income, and spending.

However, unlike with Scott, Catherine’s access to capital, control over her spending, and her income is mediated by her illness.  While Scott has been saving 10% of his income every month since he first started working, Catherine has had to devote that money to medication.  She still saved as much as she could, but those savings were all wiped out last year when she had to get surgery related to her condition. Catherine doesn’t have any disposable income that can be re-routed.  Her money goes to food, housing, transportation, and her medical bills.  As it stands, she currently spends $90 a month on medical bills, and even if she were to reroute that completely to insurance, she still can’t free up $10 more to afford the insurance.

Like Scott, Catherine can also theoretically shift jobs or alter the focus of her current work.  But we must remember that for Catherine $90 a month is automatically directed to medical expenses, while Scott has much more flexibility over that money.  Currently, Scott saves $75 a month and spends $15 on lottery tickets and lattes. Scott can easily reduce his latte intake, or his savings rate, if he values health insurance enough.  Catherine has no flexibility here.  Most importantly, however, Scott can afford to take a day or two off if he wants to.  Or he can afford to buy some soy beans as an experiment to see if they grow well.  He can also afford to buy double paned glass windows from his savings to reduce his monthly bills by $10.

Catherine does not have these options. She cannot quit her job to try to get a new one, she likely cannot afford to take a day off to go to the interview or to look at cheaper apartments, she cannot dip into savings, and she cannot adjust her spending.

Already it should be clear that while these claims look to be the same on the surface, they are effectively much different. Scott’s ability to engage with the insurance market as a free and rational economic actor is far less constrained than Catherine’s – and that matters tremendously.

Now we haven’t addressed the fact that Catherine is a woman, so on average, she is going to earn less than Scott in any profession. She will also not have access to the same opportunities.

Catherine is also a woman of colour, so the disadvantages she faces as a woman in the labour market are compounded even more. In short she has always earned less money in jobs that she is overqualified for than her white, male peers.  Due to her sex and race, Catherine has a history of being chronically underemployed and underpaid.

Since her wages have been lower than they should have been from the start, Catherine’s options for where to live have been limited. She currently can afford a house near the highway, but that means inhaling exhaust fumes day and night, which exacerbates her asthma.  There are also no grocery stores within a ten minute drive that stock fresh produce, so she tends to eat a lot at the fast food restaurant on the corner, which is still cheaper than buying the food at the corner store. She knows it’s not the best food, but she always opts for the ‘healthier’ options like the chicken salad, not realizing that even though it’s branded as ‘healthy’ it is still almost 1000 calories and contains more than double the amount of sodium she should receive per day.

Catherine’s ability to pay for her health care, and her ability to control her health (through diet and location), are largely independent from her efforts to do so. 

No matter how hard Catherine tries to adjust the income, savings, spending function, she is going to have a much harder time of doing so than Scott.  Of course it is possible to do, but it is not likely that she will be successful.

And this is the heart of the failure of the Republican position. Like many positions I’ve seen them take lately (e.g. security, trade, literally anything else) the tendency is to reduce the issue down to something very simple and without nuance. And yes, on the surface, it looks simple. Both Scott and Catherine seem to have an equal opportunity to adjust to the market – or to not participate in the market. So by freeing that market, both Scott and Catherine will be better off. This may or may not be true in the ideal situation embraced by the Republicans, but it is most definitely not true of the picture that I have presented here. A picture, I would argue, that is more in line with reality.

If it is accepted that external social, political, environmental, and biological differences exist between individuals, then we have to ask if  these differences:

a) affect an individual’s tendency to have poor health, and/or

b)  affect an individual’s ability to exert control over their income, and/or

c) affect an individual’s ability to exert control over their spending.

If any, or all of these, are true, then a system that depends wholly on an ethic of personal responsibility and freedom is fundamentally flawed.

As an educator, I see it as my duty to complicate this link between agency, structure, freedom, and responsibility. Students need to understand that there is a difference between what Berlin would call negative liberty, the freedom from something (like government regulation), and positive liberty, the freedom to do something (like actually be able to afford health insurance). They must understand that one’s education, neighbourhood, employment, diet and transportation options are not entirely within their control – that if you start a 100m dash a kilometer behind me, no matter how much you train you will never catch up.  Hard work and personal responsibility are important, and they should be rewarded, but they are not the only things that contribute to outcomes.

During the town hall, Rep. Fortenberry asked the crowd two questions:

“How many people feel that health care is a right?”
“How many people feel that health is a responsibility?”

The first was met with cheers, the second with a few claps, but much less enthusiasm. I feel this is misguided.  Fortenberry is not wrong, health is and should be a personal responsibility – but not entirely.  You cannot blame the residents of Flint, Michigan, for the fact that their children were exposed to high levels of lead, and you cannot blame the residents of Beijing for having asthma if they are too poor to afford an effective air filter.

We would be better off to think of personal responsibility as being banded – something that only applies within a certain range of outcomes that are determined by forces outside of our control.  Yes, Catherine has a choice when choosing what to eat.  But her choice is not between a grass-fed grilled chicken salad or a greasy pizza (like Bob’s); it is between a McDonald’s Burger Combo, and a KFC salad with deep fried chicken, or a bag of chips from the corner store.  Maybe the salad is healthier, and maybe she chooses it every time, but it is not going to have the same effect on her health and well-being as it would if she had Bob’s options.

Responsibility and choice require power and agency. If someone has the power, knowledge, and agency to choose to eat a healthy salad, but instead opts for a bacon double cheeseburger (like they do every day), then I have little sympathy when their health falters, and I have trouble finding objections to charging that person more money for health insurance. However, the idea that everyone has an equal amount of power, knowledge, and agency is a dangerous and flawed assumption to make. Once the social, political, environmental, and biological determinants of disease and wealth are taken into account, it is clear that individuals don’t have as much control over their health or wealth as we might initially assume.

The problem with this perspective is not that it requires personal choice or responsibility; indeed the impulse towards more freedom for individuals is an attractive one. The problem with this perspective is that it assumes – and requires – a level of freedom that simply does not exist. And any plan that is premised on freedom and control over one’s own actions, when applied to a population that doesn’t have this freedom and control, is bound to fail. 

We can expect the young, the wealthy, the white, and the men in America to do well under these proposals. And if you are all of the above, then well, you’re probably already a special advisor to the President and don’t need health insurance.  But for everyone else, to get decent and affordable health insurance you will need to exercise power that you do not have, you will have to exercise agency on processes over which you have no control, and you will need to take risks that you cannot afford to take.  You will be expected to take responsibility for your health, but you will not have the opportunities or the tools to do so. And when you die, those who do have insurance will shrug their shoulders and sigh “it’s so sad about Catherine… if only she would have worked harder and not eaten at KFC all the time.”

How I stay (kind of) sane as an academic

TL;DR: I find the following tools extremely helpful in dealing with what I used to feel was an overwhelming amount of information and responsibilities. The tools work very well together, and help to collect, sort, store, and share information without undue effort.

  • Evernote – For notes, documents, lists, etc.
  • Dropbox – For other working files, datasets, etc.
  • Pocket – For capturing new articles, videos, podcasts, etc.
  • Buffer – For automating social media posts
  • Bullet Journal – For my day-to-day organization, calendars, etc.
  • Pomodoro Technique – For actually getting shit done.

Basically, everything that I want to keep or need to reference later goes directly into Evernote.  Anything that I come across that looks like it may be interesting gets thrown into Pocket to review later. Anything worth sharing gets pushed to Buffer. I use the Bullet Journal to plan my weeks, and I move individual projects forward using the Pomodoro technique. Make sure you install the web clippers and the phone/tablet apps.

The Problem(s) with Academic Work

I am a very lazy person.  Don’t get me wrong, I want to do very many things. I have a lot of ambitions and goals, and I have so far done alright in reaching them (e.g. I had a cheesecake for breakfast today). But really I want to achieve my goals AND not have to get out of bed before noon.  I want to be successful in my career while also having a lot of time to goof off, travel, go to concerts, and learn things I want to learn.  The only way to do that is to find a way to excel at my job in the most efficient and least stressful way possible.  I want my work to be on auto-pilot as much as possible so I can spend my spare time reading China Miéville, not reviewing journal articles.

The main problem is that academic work is overwhelming, and the great blessing of academia – that every day is a Saturday and that you do what you love – means that what you love is also work, and you work on Saturdays.

As academics, we face several major problems that can lead to being overwhelmed. Here’s how I see them, and the system that I have come up with to better deal with them.

Too much information

It can be overwhelming.  Not only are there the key journals to keep up with, but also there’s stuff that’s tangentially related (or just interesting), and still there’s stuff in the non-academic world that is interesting that we want to keep track of and read eventually.

One particular issue is that there is a divide between when we find an article and when we have time to read it. So we tend to just keep open tabs, or have piles of print-outs, or some other version of an extremely cluttered desk.

What is needed here is a way of capturing interesting articles for evaluation and reading (and maybe filing) later.

Too many priorities

There are so many things that we need to do to keep on top of our work, that we often neglect the important things that would help us do better work because there is no sense of urgency to them.  So we come across articles or books that are important and interesting but not urgent (like that Foucault that’s been sitting on your desk for the past three years…), and we don’t know what to do with them. Foucault stares at you for a few years from the pile on your desk, you feel guilty (as you should), and then you just bury him under other pieces of paper or put him back on the shelf.

But we also don’t want to slip Foucault into a pile of grading, because that shit needs to get done today – it’s urgent and important. So what’s the solution? What we need here is a way to capture those things that are important and interesting, but not urgent, and work through it.  We need to somehow make the important urgent.

Too many projects

I must have something like 30 open projects at the moment, ranging in size and scope from a book to a blog post.  And I want to do all of them, but with so many it’s almost impossible to figure out what needs to be done and when. In short it’s hard to prioritize projects, so most of the time they will sit there, and most of us respond to external prompts – e-mails from co-authors, or journals, or requests from students for letters.  These external prompts take what is important and make it urgent, which is why we respond.  So again, we need to find a way to make these important projects urgent.

Too much information, again

Not only is there too much information out there that we need to process, but we also hold onto a lot of information that is relevant to our projects and our work, and there’s no standard way of keeping track of all this. We will often simply dump everything in one folder, or conversely spend an excessive amount of time trying to organize folders to put things in the most logical place so we can find them later. Neither of these methods is particularly effective.

What we need is to organize files and notes about projects in a way that makes sense to how your brain works with the information.

Too many hats

We are expected to be teachers, researchers, mentors, community members, and in a lot of fields, we’re also expected to be at least somewhat publicly engaged.  And while it might not be critical in all fields, there’s definitely an advantage to being active on social media, and that takes time and effort.  I know like one person who can sit on Facebook and Twitter all day long and has a successful career. The rest of us are screwed. So what we normal people need is a way to have an active professional social media presence with the minimal amount of work.




There are other issues we face, like imposter syndrome and horrible fashion choices, but I’ll start with the above.

So here are the tools that I use.

  • Evernote – For notes, documents, lists, etc.
  • Dropbox – For other working files, datasets, etc.
  • Pocket – For capturing new articles, videos, podcasts, etc.
  • Buffer – For automating social media posts
  • Bullet Journal – For my day-to-day organization, calendars, etc.
  • Pomodoro Technique – For actually getting shit done.


Evernote

You should use Evernote to capture everything.  EVERYTHING.  You won’t realize how valuable it is until you dive in fully.  So you will want to install the Evernote web clipper, and you will want Evernote apps on your phone and tablet, if you have those things. You will also get an e-mail address so you can e-mail things directly to Evernote (useful to BCC this address on important discussions you’d like to archive). There are also great apps for scanning receipts, documents and business cards directly into Evernote, and it is simple to pull PDFs straight in. Anytime you want to remember or keep anything, put it in Evernote and then forget about it until you sort your inbox.

Evernote has an incredible search function.  It will search through the text of PDFs you add, and other documents, and has the ability to search hand written notes. Which is fucking crazy, but yes, you can write your notes by hand, scan them to Evernote and they become searchable.  So this is why you should put everything in here, because once it’s captured in Evernote it is actually retrievable.  My old handwritten notes from my undergrad are not.  They are scattered in various notebooks, and I can’t search for “Plato” and get my notes from those lectures.  If I scanned them to Evernote, I could.

Capturing isn’t quite enough. You’re going to want to do a minimal amount of organizing – nothing ridiculous, but you will want to put notes into a notebook that represents one of your projects. That’s about it. You want to have one notebook per project (a project being loosely defined; it can also be an area of interest or responsibility, or a person, whatever the situation calls for), and then you want anything to do with that project to get put into that notebook. It sounds stupid and simplistic, and it is, but it’s horribly effective. Every meeting note, every interesting related webpage, and every academic article related to the project is in one notebook, so when you open the notebook all of the important information is there for you.

To make this a little more powerful, each notebook that I have has a “next-steps” note which outlines the next steps that I need to take if I want to move this project forward.  I try to always have a list of the smallest possible tasks that need to be done for all projects.  Because you can schedule reminders, you can tell Evernote to remind you about this project one month before the deadline, or whatever you want.

The trick here is that you want each note to fit nicely into one notebook. Do you keep putting notes into multiple notebooks? Maybe there’s a new notebook that should be made to hold those notes.  For example, instead of putting articles about teaching strategies in each of the notebooks for the classes that you teach, try having a notebook called “teaching-strategies”.  You want the notebook choice to be completely brainless and obvious.  Filing should not employ any of your brain power.

Part of this is for easy retrieval – Evernote has an excellent search function, but you often don’t know what you need to be searching for.  If you file relevant material together, it becomes more useful to you when you are working on that project.

The other powerful thing about Evernote is that it is accessible across all of your devices.  So you can enter notes on your phone, read articles on your tablet, and everything is synced and organized.  You can share notes with other people – which is great for collaborative projects and trip planning.

Of course you can organize your notebooks however you want, but after trying a few different systems, I have settled on every project getting its own notebook, some people getting their own notebook (e.g. interns, students I write letters for, etc.), and some reference material getting its own notebook (e.g. my master bibliography). These notebooks will normally resolve themselves into stacks by themselves, so organization becomes pretty easy.  Here’s how Evernote looks for me, maybe something similar will work for you:

  • _inbox – all new notes start here before being processed
  • admin – includes notebooks for general life admin stuff, receipts and tax info, etc.
  • bib – all of my pdfs for my master bibliography go in here (more on this below)
  • classes – each class I am designing / teaching / have taught gets its own notebook
  • hobbies – I actually have time for some now, so I have notebooks on each of them
  • projects – this is the largest stack – each project gets a notebook, regardless of how major or minor the project is.  A short blog post I’m working on? Notebook.  An article? Notebook.  The idea is to keep all of the relevant information in the right place so you can pick up on any project at any time.
  • projects-finished – I move finished projects here so I still have access to them but they do not clutter my projects stack
  • students – I keep notebooks on students I write letters for or supervise – everything just gets dropped in here and it makes letter writing so much easier
  • to-do – I keep a lot of lists, so this is where I keep them.  An important one is my articles notebook, which stores PDFs of articles that I have not read yet.  More on this below.
  • service – technically could be in the projects stack, but I like to keep them separate
  • someday – these are ideas for projects, or info on things that I will look into later.
  • travel – also could be in the projects stack

So now the big trick is to just put things into the right folders, and that’s about it.  When your intern sends a draft of their paper, save it to their notebook – then it’s there and you can work on it within Evernote and have a record of the steps that happened in a project.

Capturing everything in Evernote goes a long way to dealing with the amount of information we are faced with, but it doesn’t solve all of our problems.  Fortunately, when Evernote is combined with a few other tools, things become even easier.

Dropbox (or Google Drive)

I need this less and less the more I put things into Evernote (and the more I move to Git), but there are still things that don’t make sense to be stored in Evernote (e.g. my Lexicoder development environment or a series of datasets for a consultancy project).  For things like this I simply have Dropbox mirror my Evernote notebooks.  So while the Lexicoder notebook stores my notes on what features I want to add, and what bugs I need to address, the Dropbox folder stores all of the actual files for the project (it’s also on GitHub, but that’s a different post).

Mirroring is important. It means I don’t need to think “where’s that dataset?” when I’m working on my sick-leave-coverage project. I know it’s in the folder “Dropbox/projects/sick-leave-coverage/”. This is much better than having a folder of datasets, or whatever you usually do.
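
If you have a lot of notebooks, you can create the mirrored folders in one go instead of by hand. Here is a minimal Python sketch of that idea. It assumes a stack/notebook folder layout like “Dropbox/projects/sick-leave-coverage/”; the `NOTEBOOKS` mapping and the `mirror_notebooks` helper are hypothetical, the “intro-stats” class name is made up, and the notebook names are typed in by hand rather than pulled from Evernote.

```python
import tempfile
from pathlib import Path

# Hypothetical notebook names grouped by stack; substitute your own.
NOTEBOOKS = {
    "projects": ["sick-leave-coverage", "lexicoder"],
    "classes": ["intro-stats"],
}


def mirror_notebooks(root: Path, notebooks: dict[str, list[str]]) -> list[Path]:
    """Create one folder per notebook under root/<stack>/<notebook>."""
    created = []
    for stack, names in notebooks.items():
        for name in names:
            folder = root / stack / name
            # exist_ok=True makes the script safe to re-run as notebooks are added.
            folder.mkdir(parents=True, exist_ok=True)
            created.append(folder)
    return created


# Demo in a throwaway directory; point root at your real Dropbox folder instead.
with tempfile.TemporaryDirectory() as tmp:
    for folder in mirror_notebooks(Path(tmp) / "Dropbox", NOTEBOOKS):
        print(folder.relative_to(tmp))
```

Because existing folders are left alone, you can add a notebook to the mapping and run it again without touching anything already there.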

One of the most valuable things about using Dropbox and Evernote together is that they sync and store things in the background.  You do not need to worry about manually backing up your work – these services will do it for you, so you can get on with actually doing whatever it is you do.

It’s not all roses though, and the big negative here is privacy.  Neither service is perfectly secure.  Yes, your work is password protected (and Dropbox provides two-step verification), but that doesn’t mean that Dropbox and Evernote are locked out of your work.  If you want an added layer of security Evernote allows you to encrypt notes that you add to it (though they may still have cached versions of earlier notes).  The better option is to encrypt sensitive notes using GPG before adding them to Evernote, if you are so inclined.  There are also tools that allow you to encrypt files before they enter your Dropbox, which may be a good solution if you are storing sensitive data.

One final tip here is to set Dropbox to automatically upload your photos from your phone. Your phone will break or get lost, and you will lose your photos because you never back them up. Just let Dropbox handle it for you. (You may notice that I am really focused on automating as much as I can so I can focus my actual attention on other things.  I don’t want to have to think about backing up my photos, I just want to know that when my phone explodes, everything was backed up.)


Pocket

It’s just a read-it-later app, but it’s amazing how much of an impact Pocket has had on me.  I use Pocket to capture anything that looks interesting from the internet (including videos and podcasts) to read later.  This is really important, since most of the time when I find an article that seems interesting, it’s not the best time to actually read it.  I used to just leave these tabs open and constantly stare at all the articles I wanted to read, which wasn’t particularly useful and just distracted me.

Pocket grabs the article, syncs it across all of your devices, and keeps an archive of all the articles you’ve read.  So you don’t lose anything that you thought might be interesting, and if you ever want to go back to some article you read before, it’s in your archive.  It’s also really easy to share articles from within Pocket, which is important.

On occasion, something I read in Pocket will be relevant to my work life, maybe a long-form New Yorker article will be perfect for a class or something.  In this case, Pocket’s not the best solution for storing it – but fortunately we already have a system for these things in Evernote, so I just share the article to Evernote with a click and we’re done.

I’ve actually gone one step further and made it easier to move things to Evernote.  I have implemented an IFTTT script to make sure that any time I favourite anything in Pocket it automatically shares it on social media and pushes it to Evernote.  So if something is especially interesting, I can share it and move it into long-term storage with one click.  More on this over at LifeHacker.


Buffer

Pocket solves the problem of the disconnect between input and processing of information, but if we sit down to read 10 articles and then post them all to social media all at once, we’re not really doing ourselves or anyone else any favours.  Instead of sharing to social media directly, we can use Buffer to fix the disconnect between the processing and the output of information.  In other words, while we may want to read things all on a Sunday, we want to be posting to social media somewhat more consistently if we want to build up a professional profile.

Buffer will automatically schedule your posts for ideal times, and spread them across the week. So you can just keep adding to your list, and Buffer will post them at appropriate times, spaced out for you without you worrying about it.  It also works across multiple accounts (e.g. Facebook, Twitter, Instagram) and can be incredibly useful in maintaining a consistent and professional online presence.

This is also an incredibly good tool if you are in charge of a professional account, such as your department’s account.  You can spend one hour at the beginning of the week scheduling your social media posts and then forget about it.  Of course you will probably spend much more time on social media, and go nuts.  But with Buffer, anything else you do is gravy – you already have a strong basis in place for a consistent presence.

With Pocket and Buffer, nothing really changes in your day to day life.  You are still reading articles and sharing them on social media, but these apps allow you to gather from many sources into one place, parse things once, and share to many places – all with no real additional work. The benefits are many: a consistent social media presence, an archive of articles you’ve read, the ability to schedule tweets and Facebook posts in the future, and the ability to capture articles and interesting things you encounter on the internet without having to divert your attention away from what you’re doing when you first find them.


Bullet Journal

All of this falls apart without an actual system to work with it, and I use a bullet journal to keep track of everything.  The beauty of a bullet journal is that it is flexible and structure-free: you basically create the modules that you need and tweak them to reflect your work habits.  Once you get into it, it’s a perfect system (because you create it for yourself). I have since ditched my digital calendar and to-do lists for the paper-based journal.

Part of what makes the bullet journal appealing is that it takes time, and you need to write things down.  This means that if you keep on writing “clean the bathroom” every day because you just don’t get around to cleaning the bathroom, eventually you will get sick of writing it and either a) clean the bathroom or b) decide that it’s not important to clean the bathroom and stop writing it down. Writing also helps you remember things, and most important for me, there’s something really satisfying about physically crossing something off a to-do list.

Important in my bullet journal are monthly logs, monthly goals, and daily logs.  You should read up on the bullet journal if you want more info on the basic system, but the way I use it is rather straightforward.

I have monthly project goals, which are translated to weekly goals, and then daily to-do items by looking through my Evernote next-steps notes. This allows me to have a very reasonable number of items in my to-do list every day.  Importantly, for reasons discussed below, I try to ensure that each task takes about 20-25 minutes to complete.

I have a monthly log, which has a list of all the things that are important to me (like exercise, reading academic articles that are not associated to a particular project, writing something, etc.)  These things all get an entry on my daily to-do list as well, and importantly, they are tracked.  Every time I do one of these things, I add an X to the log and start to build a chain.  This was Jerry Seinfeld’s method too: your only job – be it with writing, generating new comedy material, or whatever – is to not break the chain.  Tracking it this way is what helps convert an important but not urgent task into something that is important and urgent.

So for example, I have an Evernote notebook of articles – these are academic articles that I want to read that aren’t necessarily related to any particular project, but are interesting or important for me to keep up with.  Every day I have an item in my daily log that says “Read article.”  It doesn’t matter which one, I just have to read one from this list.  When I read it, I enter the bibliography information into BibDesk, and I take short notes directly into the file.  Twenty minutes a day, not a lot of effort, and after a year I will have added 260 additional references to my bibliography – and that’s on top of anything I’ve done for other projects.  Now, also let’s say that I add “Write one page” to my daily log.  That’s a novel. And this is how these super productive academics produce so much. Consistent minimal effort, not week-long whiskey fueled binge writing.  Though I guess you could do both.
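
The arithmetic behind that 260 is worth spelling out. Here is a quick sketch, assuming a five-day work week and 52 weeks a year (my assumption, but it is the one that makes the number come out); the function names are just for illustration.

```python
WORKDAYS_PER_WEEK = 5  # assumption: weekends off
WEEKS_PER_YEAR = 52


def yearly_total(per_day: int = 1) -> int:
    """Items finished in a year when you do `per_day` items every workday."""
    return per_day * WORKDAYS_PER_WEEK * WEEKS_PER_YEAR


def yearly_hours(minutes_per_day: int) -> float:
    """Total hours spent over a year at `minutes_per_day` every workday."""
    return minutes_per_day * WORKDAYS_PER_WEEK * WEEKS_PER_YEAR / 60


print(yearly_total())              # 260 articles (or pages)
print(round(yearly_hours(20), 1))  # 86.7 hours of reading
```

So one article (or one page) per workday is 260 by the end of the year, for roughly 87 hours of total effort spread so thin you barely notice it.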

At the beginning of the week, I look at my monthly log and see which projects I need to work on. I then consult the next steps in Evernote and schedule these during the week, breaking these tasks down into the smallest possible chunk.  “Write article” is useless, “Read for literature review revision” is better, “Read Hall & Taylor” is best. Ideally, this is how they have been recorded in the next-steps note. The result is that my daily logs have all my work that I want to do, all of the important-but-not-urgent things that I want to do, and any appointments or other obligations. And for the most part they are all broken down into tasks that will require 25 minutes or less.

The Pomodoro Technique

All of the above is for nothing if you don’t actually do anything. And it’s easy enough to procrastinate and neglect things, no matter how focused you are.  But this system actually makes not working kind of difficult.

First, I find that my daily to-do list will generally have about 8 concrete tasks on it – which is about four hours of focused work.  That’s really not that onerous to think about.  And it feels great to check all of these off – especially when you can blitz through your work so quickly.

Second, you’ve chunked things into 20-25 minute tasks. And guess what? You can do ANYTHING for 20-25 minutes.  So set a timer and go. Even if you don’t want to read that article, it’s only 20 minutes. Don’t feel like looking at that dataset? It’s just twenty minutes, stop your whining. This is where the Pomodoro technique comes in. The Pomodoro technique involves doing a task for 25 minutes and then taking a five-minute break.  Then after four of these sessions, you take a longer break. Your work day is now two hours followed by a break followed by two more hours.  You will move forward eight separate projects or things that are important to you, and then you can get on with doing whatever you want to do with the rest of your day.
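
To see what that day looks like on a clock, here is a small Python sketch of the schedule. The 25-minute and five-minute figures are from the technique as described above; the 30-minute long break, the 9:00 start, the date, and the function name are my own placeholders.

```python
from datetime import datetime, timedelta

WORK = timedelta(minutes=25)
SHORT_BREAK = timedelta(minutes=5)
LONG_BREAK = timedelta(minutes=30)  # assumption: "a longer break" has no fixed length here
SESSIONS_PER_BLOCK = 4


def pomodoro_schedule(tasks, start):
    """Return (begin, end, label) tuples: 25 on, 5 off, long break after every 4th task."""
    schedule = []
    now = start
    for i, task in enumerate(tasks, start=1):
        schedule.append((now, now + WORK, task))
        now += WORK
        if i == len(tasks):
            break  # no break needed after the final task
        is_long = i % SESSIONS_PER_BLOCK == 0
        pause = LONG_BREAK if is_long else SHORT_BREAK
        schedule.append((now, now + pause, "long break" if is_long else "break"))
        now += pause
    return schedule


# Eight placeholder tasks starting at 9:00 on an arbitrary date.
day = pomodoro_schedule([f"task {n}" for n in range(1, 9)], datetime(2016, 1, 4, 9, 0))
for begin, end, label in day:
    print(f"{begin:%H:%M}-{end:%H:%M}  {label}")
```

With these placeholder numbers, eight tasks starting at 9:00 wrap up at 13:20, long break included.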

This makes it really easy to consistently read outside of my “urgent” list, it makes it easy to write things that have nothing to do with projects on deadline, it makes it easier to learn new skills and methods because, really, 25 minutes of anything is really easy to do. The trick is to just prioritize what you want to spend those 25 minutes on, and then just do them.  You actually have to do some work eventually. Sorry.

What’s crucial here is that you can obviously spend more than 25 minutes on something, and sometimes you will want to, and sometimes you should – but you don’t have to. Just be consistent and do this shit every day and you will push it all forward. So what I generally try to do is put those things that I have a tendency to work more on at the end of my list – it may take me 25 minutes to get into working on a dataset or to start coding, but once my brain is there, I will likely want to keep going – so I give myself the opportunity to do that by doing the other things on my list first.

Again, nobody is saying you can’t keep working on your projects later or for much more time if you want to.  Go for it if you want to. But remember that you don’t have to.  You’ve done all the things that you need to in order for your work to progress, so anything else you do you can actually see as doing it for sheer enjoyment, which is (I think) why we all got into this game to begin with, no? The system I’ve set up here is all pretty straightforward and mechanical, as it should be.  The idea here is to move the stuff forward that needs to be done with the minimal amount of effort, stress, and attention so you can get on to more important things, like reading outside of your field, socializing, or god forbid, having a relationship. Hell, work more if you want to – but now you get to do so out of choice instead of stress, pressure, or obligation.


Concluding Thoughts

So that’s my current system.  It’s still being tweaked, but I find that I am doing a lot less work and moving a lot more things forward.  I am not as stressed as I used to be, and I have no worries about missing articles or books, and no compulsion to read them immediately.  I know they’ve been safely stored, and I know where to find them, so when I have time to read for pleasure (which I have much more of now), I don’t waste that time clicking around the internet – I have already gathered the things I’m interested in.

My weeks are really manageable, I’ve automated as much as I can, and I am much less worried about stagnant projects because I’ve got a way now to easily push them all forward.  This system also gives me much more time to react to fires and other people’s deadlines.

Most important to me is that 20 hours of my day is mine to do whatever I want with, and I know that even if I spend all that time playing video games, as long as I’ve worked through my daily log, all the important things to me have been handled.  No stress.  And there’s no need to keep that number pegged at 4 hours per day – the amount that you want to commit to focused work is up to you.  Personally, at this moment (as I don’t have any classes to teach), four hours is great. But I also know that I will be able to scale this up on days/weeks/months when necessary, and my other projects and responsibilities will not be neglected.

This combination of tools for dealing with large amounts of information, and the strategy of chunking things into 25 minute tasks, has been incredibly useful for me. The micro-work adds up quickly, and then you can sit around later watching The Expanse without feeling guilty. Which is what you most definitely should be doing.


I wrote this as a response to a friend’s request to describe my system of keeping on top of projects. Clearly, I think too much about this stuff, but I am actually really interested in other people’s systems and approaches. So if you have any of your own preferred methods and systems, I’d love to hear about them.  I’m on Twitter and other stuff @markdaku

What the HIV/AIDS epidemic can tell us about how to fight Zika (Monkey Cage)

Are some countries — and some political systems — more at risk for bad health and epidemics than others?

Economist Amartya Sen has famously argued that democracies don’t have famines. Here’s his reasoning: Famines result not from failed crops but from failures to distribute food. In functioning democracies, governments are accountable to the public — and therefore have more incentive to prevent catastrophic failures, such as famines, from occurring.

Is the same true for epidemics?

[Read full post at the Monkey Cage]

Favourite Albums of 2015

As is always the case, there are a ton of phenomenal albums that I just discovered (Miguel, Jamie XX, Turnstile, etc.) and a lot of albums that I know should be on my list, but didn’t make the cut. Sufjan Stevens’ Carrie & Lowell is a good example: haunting, beautiful, perfectly executed, but ultimately too sad to listen to over and over.

So here are my favourites, in rough order.

  1. Ryan Adams – 1989 / Live at Carnegie Hall
    This may be breaking the rules, but this combination topped my listens this year by a long shot. And while I don’t think either one would have made my top 5 on their own, something about the two of them together just worked so well. I had never really been into Ryan Adams before (besides his homage to my profession and a pretty solid Oasis cover), but the live double disc at Carnegie Hall is just a masterpiece. Not only a wonderful performance, but the way Adams engages with the crowd and brings you into his life is refreshing and vulnerable. His between-song banter made me more engaged with his music, and then when 1989 came out, it wasn’t simply a collection of covers. It was part a love letter to Taylor Swift, part a love letter to pop music, and part a portal into the sadness that is Adams’ life. More importantly, it made me appreciate Taylor Swift’s 1989 for the pop masterpiece that it is. By reinterpreting and representing Swift’s work as his own, Adams created something new. Curiously, this new thing is not better than the original, it is its own thing – but it also makes the original better, partly because it showcases what Swift’s work does that Adams never can. You listen to Swift and you realize that while she can shake it off, Adams clearly cannot, that Swift’s relationship may never go out of style, but Adams’ already has. So it brings Swift’s pop into a new dimension, provides a new perspective on it, and consequently makes it a better album. So basically, Ryan Adams gave me three excellent albums this year – so he wins – but just barely…
  2. Kendrick Lamar – To Pimp A Butterfly
    Enough ink has been spilled about this album. It is masterful start to finish, and it is wonderfully cerebral. Lamar draws from literature, history, hip hop, and god knows where else to construct songs that are accessible on the surface and then gain depth and meaning the more effort you put in them. University professors are ditching their old syllabi and teaching this album. I have little doubt that this album is one of the most important albums of the decade, and will just get more important with time.
  3. Ought – Sun Coming Down
    Seeing Ought coerce a church full of seated concertgoers to come to the front and dance in the pews was one of my concert highlights of the year. This album is a worthy follow up to 2014’s ‘More Than Any Other Day’, and shows maturity and growth without losing the elements of their sound that I love so much. For example, in ‘Beautiful Blue Sky’ we’re treated with a repeated refrain of mundanities: “How’s the family? How’s the family? How’s the family? How’s the family? How’s your health been? How’s your health been? How’s your health been? How’s your health been?” The result is to bring meaning to these phrases before bashing them back down into meaningless small talk through repetition. I can’t say enough good things about this album.
  4. Titus Andronicus – The Most Lamentable Tragedy
    A rock opera. Titus Andronicus made a freaking rock opera. As with Lamar’s album, there is a depth here that is worth unpacking, and like earlier work, TMLT invites you into Patrick Stickles’ struggles with depression, anxiety, eating disorders and all the rest. But mostly it’s just good punk music. Not at the level of The Monitor, but damn close.
  5. Father John Misty – I Love You, Honeybear
    I’m not going to lie, part of my love of this album is the weirdness that happened when Father John Misty recorded covers of Ryan Adams’ Taylor Swift Covers in the style of the Velvet Underground and then had a drug fuelled dream where Lou Reed came to him and told him not to mess around with his memory. That aside, the album is kind of perfect. It feels like a perfect snapshot of 2015 – though I can’t exactly tell you why. It’s basically a blues album with quirky lyrics and a heavy dose of nostalgia for the 60s, which most listeners of this album only know as myth. And maybe that’s it, FJM takes a shared myth, packages it in contrast to modern disaffection with, well, everything, and presents an image of something that folks in their thirties can all comfortably yearn for.
  6. Bully – Feels Like
    Imagine Hole put out a second album that was really good. Loud, infectious, unapologetic straight up 90s grunge. Fun as hell, and even better live. The album lacks depth and burns out rather quickly, but while you’re into it, it’s a blast, and unlike Hole, I’m really looking forward to their next album. I just found out about Hop Along (otherwise they’d have been on the list too) and I think fans of Hop Along will enjoy Bully, and fans of Bully will see in Hop Along the kind of depth and complexity that they will (hopefully) one day bring to their sound.
  7. Courtney Barnett – Sometimes I Sit and Think, Sometimes I Just Sit
    In terms of style, Barnett is just off on her own. It’s catchy storytelling with inane yet compelling stories. The album is refreshing, funny, sprawling, interesting, and raw – and even better, it takes its name from a poster that Barnett’s grandmother has in her bathroom.
  8. Mandolin Orange – Such Jubilee
    I’ve been in love with these guys since I saw this touchingly beautiful performance (below) – rare are performers so in sync with each other. The album is wonderful, nothing particularly challenging or groundbreaking, just some excellent indie-tinged bluegrass.
  9. Drake – If You’re Reading This, It’s Too Late
    Before Hotline Bling stole all the attention, it’s worth pointing out just how good this album is. I’ve always kind of liked Drake, but never really got into him – but this album stayed on repeat for quite a while.
  10. Run the Jewels – Meow the Jewels
    From the backstory of where this came from, to the resignation that it was happening, and then the doubling down on making it right, I love everything about this. While the album isn’t the best thing out there this year by far, and it doesn’t come close to the original, the challenge that the producers gave themselves, and the fact that this ends up being an okay album instead of simply a novelty is damn impressive.

A Fierce Lion: Thoughts on Paris, Neo-Orientalism, and Facebook

“If one reads a book claiming that lions are fierce and then encounters a fierce lion (I simplify, of course), the chances are that one will be encouraged to read more books by that same author, and believe them. But if, in addition, the lion book instructs one how to deal with a fierce lion, and the instructions work, then not only will the author be greatly believed, he will also be impelled to try his hand at other kinds of written performance.  There is a rather complex dialectic of reinforcement by which the experiences of readers in reality are determined by what they have read, and this in turn influences writers to take up subjects defined in advance by readers’ experiences.”

– Said, Orientalism (1978, p. 93-94)

By now I’m sure you’ve heard of the recent attacks in Paris. And perhaps you’ve heard of the recent attacks in Beirut, and Baghdad, and Afghanistan. This post is not about the attacks, and it’s not about the selective grief that has emerged or how the divergent responses by Facebook reveal underlying structural racism.

Instead, these are my preliminary thoughts on something that I find concerning: what Said has called “the seductive degradation of knowledge, of any knowledge, anywhere, at any time” (Said, 1978, p. 328). This piece draws on Said’s Orientalism, taking as a jumping off point the creation of the Orient in the most general way*. I would like to bring attention to the role that all media – in this case, specifically Facebook – plays in creating, maintaining, and in Said’s words ‘structuring and restructuring’ particular understandings of the world.

The medium selects what is relevant to us

This is true both in theory and in practice. Theoretically, Facebook, Twitter, and other social media have the ability to selectively curate what we see in our social media feeds. This is not a particularly controversial observation – newspaper editors have always had the same power. However, while we are aware that newspapers have editors, I believe that we rarely think that what we see in our social media news feeds might be selected for us.

But of course it is selected for us.  Facebook’s normal setting displays only the ‘Top Stories’ – which necessitates a filtering out of ‘not top stories.’ If you run a Facebook page, you cannot reach all your followers without paying extra.  Facebook has even been found experimenting on users by manipulating their feeds.

If Facebook decides (through whatever reason, payment, or algorithm) that something is important, it will get a disproportionate amount of attention on your news feed.

This is precisely what happened when Facebook provided a safety check in function for the Paris attacks, but not for other recent attacks. Again, with intent or not, Facebook curated the world for us, determining that Paris would be an issue worth caring about. Not providing a safety check in for other emergencies becomes action on the part of Facebook when juxtaposed against Paris.

What is especially important is that the medium is not a public service – these are private for-profit corporations. Their actions must ultimately benefit themselves and their shareholders, and they have no real obligation or responsibility to the public interest. Any claims to acting in the public good must be seen through the lens of corporate interest and profit. When Facebook allows a user to check in that they are safe in Paris, but not in Beirut – and even if they do provide a safety check in for Beirut – it is because, for one reason or another, it is good for Facebook to do so.

The medium shapes how we respond to what is relevant to us

After first shaping what is publicly relevant, the medium then shapes how we react to it.

By selectively highlighting relevant articles that you can share with your friends, NGOs that you can donate to with a click, or providing an easy way to show solidarity by changing your profile picture, Facebook exploits our emotional labour into doing work for them.

When we ‘check in’ in Paris, we create value for Facebook. Facebook becomes the place to go in an emergency to publicly announce your safety, and it becomes the place to go to verify the safety of your friends and family. It becomes necessary.

When we change our profile photographs to a Facebook-decided image – in this case the French flag superimposed over your current photo – we do work for Facebook. How we express our grief, solidarity, or fear, is determined by the medium in service of the medium. By limiting and encouraging a particular type of interaction with events of this kind, the interaction can be measured, monetized, and used (again, see Scott’s Seeing Like a State). Facebook has found a new way to extract labour from our feelings.

We feel something and we don’t know what to do with that, but Facebook makes it easy. Not only can we broadcast our feelings, but in doing so we encourage others to do so. Importantly, we all broadcast our feelings in the same general way. A flag over a face is abstract, removed from specific feelings, so we all feel the same thing, and we all collectively feel better when the wave of French flag faces disappears.

And they will disappear, in a wave at the same time, because Facebook has rolled out a temporary profile picture function which automatically reverts to your old profile picture after a certain amount of time**.

It is so easy now to feel something and to do something about it. And Facebook makes it even easier by telling you when it is okay to stop feeling.

Neo-Orientalism and Facebook

Which brings me back, finally, to Said and the quote from the beginning of this post.

Part of Said’s argument is not just that the other is constructed, but that it is structured and restructured dialectically. One first defines the Orient, and then subsequent understandings of the Orient depend on that first understanding, ad infinitum. The knowledge produced in this manner has little relation to reality, and has everything to do with the power exercised by the producers of knowledge.  In the case of Orientalism, the knowledge was created in the service of Empire. In the case examined here, the knowledge is created in the service of private capital.

By defining what is important, and by providing us with the tools to interact with it, Facebook structures our understanding of the world. And that understanding serves Facebook’s interests.  Paris matters and Beirut does not because, for Facebook there is more to be leveraged from public expressions of solidarity about Paris than there would be about Beirut.  Importantly, the dialectic continues.  Paris – and how we interacted with that event – become a fact, a referent for future events. Beirut does not.  Other ways of interacting do not. Our understanding of the world is circumscribed to that which serves the interest of the medium.

How does this work in Facebook’s interest? Beyond entrenching Facebook as a necessary social tool in emergencies, there are many opportunities for profit, most notably the exploitation of the information they are gathering through nurturing our philanthropic impulses. A good example comes from Facebook’s new donation feature for selected charities.

When it comes to these donations, Facebook will retain your payment information for future purchases and encourage you to share the fact that you have given. But what they will never do is share your information with the charity that you are donating to.

By inserting themselves as an intermediary between a Facebook-selected charity, and an online population that they are encouraging to become engaged, Facebook becomes invaluable to charities and NGOs.  And I repeat, “non-profits will not have access to donor contact information when someone makes a donation on their Facebook Page.” Facebook also handles all of the payments.

When Facebook first launched their Pages feature, they were free and operated without restrictions. Once they became indispensable for businesses, they changed the way the Pages worked, requiring owners of pages to pay in order to fully access their followers.

While I do not think that Facebook will start to charge users to use the check in feature, I doubt it will be long before Facebook starts allowing charities – or other interested parties – to pay to access the list of people who donated in the past.

What is particularly interesting and insidious about this, is that Facebook creates this new opportunity for profit by manipulating the way in which we see and interact with the world. They create the lion, they decide that it is fierce, and they show us how to deal with it. And through this sequence, they extract labour and value from our grief, our solidarity, and our fear.


* I’m not positioned to do the intellectual labour necessary to situate this within a broader orientalist discourse, but would be eager to read any work that has done this.

** I believe you can set how long this is, but most people opt for the default option for most things, so the likely outcome is the default.

Academia.edu, metadata, and freedom

Recently, I came across this piece on Academia.edu that dovetailed nicely with a lot of things that I’ve been thinking about lately.

A friend asked what we are supposed to do about it, and since I couldn’t come up with a pithy, sarcastic Facebook status, here’s my long-winded answer. I should note that I am in no way an expert on any of this, and that there is a ton of better information out there if you’re willing to look for it. This is likely going to be the first post in several related to this topic.

‘Should This Be the Last Thing You Read on Academia.edu?’

The piece is worth reading in its entirety, but essentially, it demonstrates that posting research on Academia.edu is not the equivalent of open access, particularly because (surprise) Academia.edu is a company.  The business model of Academia.edu is not unique, but it’s also not something that I think a lot of academics have paid much attention to:

“The goal is to provide trending research data to R&D institutions that can improve the quality of their decisions by 10-20%. The kind of algorithm that R&D companies are looking for is a ‘trending papers’ algorithm, analogous to Twitter’s trending topics algorithm. A trending papers algorithm would tell an R&D company which are the most impactful papers in a given research area in the last 24 hours, 7 days, 30 days, or any time period. Historically it’s been very difficult to get this kind of data. Scientists have printed papers out, and read them in their labs in un-trackable ways. As scientific activity is moving online, it’s becoming easier to track which papers are getting more attention from the top scientists. There is also an opportunity to make a large economic impact. Around $1 trillion a year is spent on R&D globally: about $200 billion in the academic sector, and about $800 billion in the private sector (pharmaceutical companies, and other R&D companies).”
– Richard Price, CEO of Academia.edu, quoted in Hall.


In other words, there is significant value in your activities if they can be captured. Knowing what you search for, what you read, and who you look up and contact, is potentially very lucrative for Academia.edu and the companies they are looking to sell this data to. For a company like Academia.edu, the best way to be successful is to collect as much metadata as possible on as many people as possible.


If you don’t know what metadata is, here is a good primer from Edward Snowden, and a related discussion of linkability from Jacob Appelbaum. Basically, metadata is data about data.  It is important because it can tell us a lot about a lot of things.  For example, metadata on cellphone calls would include times of phone calls, durations of phone calls, the two participants in the call, etc.  What it does not capture is the content of the call. Linkability is essentially the ability to connect metadata. With linkability and enough metadata, the content of data becomes irrelevant.
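To make the distinction a bit more concrete, here is a minimal Python sketch. Everything in it is made up for illustration: the phone numbers, the `directory` lookup, and the `link` helper are assumptions of mine, not anyone's actual system. The call records hold no content at all, only who called whom, when, and for how long. Linking them to a second dataset on the shared phone number is what makes them revealing.

```python
from collections import defaultdict

# Hypothetical call records: metadata only, no call content.
call_records = [
    {"caller": "555-0101", "callee": "555-0199", "start": "2015-03-02T23:40", "minutes": 42},
    {"caller": "555-0101", "callee": "555-0142", "start": "2015-03-03T09:05", "minutes": 3},
    {"caller": "555-0177", "callee": "555-0199", "start": "2015-03-03T10:15", "minutes": 7},
]

# A second, separately collected dataset (also hypothetical).
directory = {
    "555-0199": "oncology clinic",
    "555-0142": "insurance broker",
}


def link(records, directory):
    """Join call metadata to the directory on the shared phone number."""
    profile = defaultdict(list)
    for record in records:
        label = directory.get(record["callee"])
        if label is not None:
            profile[record["caller"]].append(
                (label, record["start"], record["minutes"])
            )
    return dict(profile)


profiles = link(call_records, directory)
for caller, contacts in profiles.items():
    print(caller, contacts)
```

Nobody listened to a single call here, but a long late-night call to a clinic followed by a short call to an insurance broker already tells a story.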


This is not new, and many internet companies are based on this model – Facebook, Google, pretty much any online service that is free. And as consumers, we are generally okay with using free services knowing that they are collecting data on us.  Here are a few reasons why we should rethink our relationship with these free services.

1) Metadata does not accurately describe anyone

Metadata is nothing more than an abstraction about you and your activities.  For example, it is not the content of your phone call, it is who you called and when.  So it cannot reveal what you and the store clerk spoke about, but it can demonstrate that you were both there at the same time. I must stress that metadata is not inherently evil, but it does perform a specific task of simplification, of abstraction, and ultimately, provides new mechanisms of control.  In ‘Seeing Like a State’, Scott articulates this point nicely:
“Certain forms of knowledge and control require a narrowing of vision. The great advantage of such tunnel vision is that it brings into sharp focus certain limited aspects of an otherwise far more complex and unwieldy reality. This very simplification, in turn, makes the phenomenon at the center of the field of vision more legible and hence more susceptible to careful measurement and calculation. Combined with similar observations, an overall, aggregate, synoptic view of a selective reality is achieved, making possible a high degree of schematic knowledge, control, and manipulation.”
(Scott 1998, p.11).


A large amount of metadata allows the behaviour of a large number of people to become legible. Once legible, it can be used: by understanding aggregate groups of individuals through metadata, these groups can be engaged with, knowledge can be extracted about them, and products can be sold to them. uses metadata to understand what academics are reading in order to scoop new technologies and understand the direction research will go in, while the American government “kills people based on metadata.”


Again, metadata is not bad in and of itself, but we should be wary of its use and our complicity in perpetuating its use.  The problem is that metadata is an abstraction, and institutions use this metadata to determine how they engage with individuals.


In other words, imagine there is a forest, and in that forest there are many trees.  A surveyor sees the forest from a plane, and says “there are two types of trees – those that have dark green leaves, and those that have light green leaves.”  This is the metadata, it leaves out a tremendous amount of information about the forest – and more importantly, it is organized according to a human being or institution that has a particular interest in organizing the forest in a particular way.  Say trees with dark green leaves have stronger wood (I don’t actually know anything about trees).


Now the forest is legible in a certain way, and because it is legible, it can be used.  Forests with more dark green leaves are cut down first for lumber, and only trees that will produce dark green leaves are replanted, without any regard to the dynamics of the forest itself.  The interesting thing is what happens with those trees that do not have dark or light green leaves – but have something in between.  They must become one or the other. They must fit the categories imposed by the metadata to be useful.


When the world is categorized in a certain way, the world can only be understood through those categories, and institutions have incentives to find ways to ensure that these categories are reproduced so that they can continue to understand the world.


This matters because while we are not merely our metadata, institutions can only understand us through it.  This may not be a huge problem for a company like that (let’s be honest) doesn’t create a lot of value for its users, but it is a serious problem when we are talking about States or institutions that are more critical to your daily life. For example, if you live in a particular neighbourhood, and you have a particular kind of job, and you are a particular age, and a particular race, the risk of a company insuring you is going to be based on metadata profiles of similar users.  Ditto for bank loans.  This is not some dystopian future, it is simply the current way of doing business.

2) Metadata is not actually anonymous

States and institutions conceptualize the world in a certain way and then use that abstraction to draw inferences about individuals.  They can say confidently that 80% of 40-year-old Torontonians use the TTC, but aggregated metadata cannot say that Jon Smith uses the TTC.  In theory, metadata is anonymous, as it does not (or should not) contain identifiable information. This is one of the big arguments for why we should not be concerned with the collection of metadata.

The problem is that with enough metadata, and with linkability, it is not particularly difficult to connect metadata to individuals.  A recent MIT-led study did exactly this. They analyzed “3 months of credit card records for 1.1 million people and show that four spatiotemporal points are enough to uniquely re-identify 90% of individuals.” So while each element of metadata may not uniquely identify you, if there is enough of it, and it has elements in common (such as an IP address), then going from the abstract to the individual is not difficult.
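Here is a toy Python sketch of why a handful of points can be enough. To be clear, this is not the study's data or method: the `traces` below are invented, and `unicity` is just my name for a simple measure, the share of k-point samples that match exactly one person in the dataset.

```python
from itertools import combinations

# Hypothetical "anonymized" traces: user id -> set of (place, day) points.
traces = {
    "u1": {("cafe", 1), ("gym", 1), ("bakery", 2), ("station", 3)},
    "u2": {("cafe", 1), ("gym", 2), ("bakery", 2), ("station", 3)},
    "u3": {("cafe", 1), ("gym", 1), ("library", 2), ("station", 3)},
    "u4": {("cafe", 2), ("gym", 1), ("bakery", 2), ("station", 4)},
}


def matching_users(known_points, traces):
    """Return the users whose trace contains every known point."""
    return [user for user, points in traces.items() if known_points <= points]


def unicity(traces, k):
    """Share of k-point samples that single out exactly one user."""
    total = 0
    unique = 0
    for user, points in traces.items():
        for combo in combinations(sorted(points), k):
            total += 1
            if matching_users(set(combo), traces) == [user]:
                unique += 1
    return unique / total


for k in range(1, 5):
    print(k, unicity(traces, k))
```

No single point says much, since most places are shared with someone else. But each additional known point rules out more people, and in this toy dataset knowing all four points always narrows it down to one person.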

The point here is that institutions do two things – first they create sketches of swathes of groups based on metadata. But then they can also create a sketch of individuals based on their own metadata. An institution thus understands you through two overlapping abstractions. You become a distorted, simplified image – and it is with this distorted, simplified image that states and institutions act.


3) Your labour is being exploited without your knowledge and without remuneration

By participating in these services, by supplying a constant stream of metadata, you are working for these services. Clicking on a paper on creates value – and ultimately money – for the service. In short, you are performing labour and not being remunerated for it. Indeed, the most disturbing thing here is that many people do not see that they are performing labour, and do not realize that it is being exploited.


Take for instance CAPTCHA. In order to ensure that a human and not a robot is accessing certain sites, you have to perform a task that only a human can perform: reading scanned text, or identifying images that contain cake.  A lot of this is for good. For example, Google is using this data to help digitize books, which is arguably not a bad thing. Indeed, the motivation for doing so seems pretty legit (this is worth watching).


Importantly, when we encounter a site that has a CAPTCHA, we cannot opt out. And we do not know that we are performing labour. We do not know that our actions are being used for profit, and we do not know how. We don’t know what books we are digitizing, because we didn’t know that we were digitizing books.


Another example is GOOG-411. If you remember, you could literally call the internet, and then get Google search results over the phone. But the point of GOOG-411 wasn’t to allow you or me to call the internet, it was to collect voice data – to improve voice recognition and speech-to-text algorithms. So unwittingly, GOOG-411 users laboured for free to benefit a for-profit institution.


For some, this won’t be a problem.  You may think “whatever, I was going to do the labour anyway.” But you weren’t.  For example, CAPTCHA is designed to extract specific labour from you, labour that you weren’t going to do for fun – unless you willingly spend your afternoons selecting the pictures that contain bananas from a random set of photos?  These services are designed to extract a specific kind of labour from you, and to do so in such a way that you do not see it as labour.


Worse is that they extract from the public in order to enrich the private. As Hall notes, “But just as Airbnb and Uber are parasitic on the public ‘infrastructure and the investment’ that was ‘made by cities a generation ago’ (roads, buildings, street lighting, etc.), so has a parasitical relationship to the public education system, in that these academics are labouring for it for free to help build its privately-owned for-profit platform by providing the aggregated input, data and attention value.”

As citizens, we have helped to pay for the scaffolding that supports a service like, which in turn uses that scaffolding to extract more value from us.  It may all be in the abstract, but at the end of the day, our labour is being used to make someone else money. And it is often done without our consent or knowledge.

Metadata & Freedom

The reason why any of this matters is because it negatively impacts the freedom of individuals.  The most obvious problem is that of privacy. Massive amounts of metadata and linkability destroy privacy, which last I checked is still a fundamental human right. The State’s use of metadata in the name of national security is particularly problematic, and has been written about extensively elsewhere.  But at the end of the day, without the privacy to learn, to read, to look at whatever websites we want, to speak to whoever we want, we are not free. We will self-censor, we will hesitate to discuss certain topics, we will curtail our own freedom.

Relatedly, when institutions prefer a world of dark and light green leaves on trees, we as individuals are systemically encouraged to see the world through that lens. Institutions frame how we understand the world around us, and they frame it in ways that are beneficial to the institutions, not to the citizenry.

A lot of institutions are collecting data about your online activities. The governments of the US and UK collect pretty much every single packet that gets transmitted on the internet.  That has terrifying implications. If everything is being collected and stored, then institutions have the capability to retroactively construct narratives of individuals based on metadata. The NSA’s XKEYSCORE program does just that.

So what can be done about it? It is exceptionally difficult to remain completely anonymous and private while online (not impossible though), but it’s not particularly hard to make it more difficult for people and institutions to use your data and your labour for their own gain.

Here are a few suggestions, I’ll get into more in a future post:

1) Constantly remind yourself that free services aren’t free. It is important to remember that data is always being collected about you in order to try to understand you. What is being done with this data is usually unknowable, so you should always ask yourself how necessary this service is to you before you volunteer to allow an institution to collect your data.

2) Disengage. Obviously the best thing to do is to stop using these services, especially the ones that don’t bring you much value. But for a lot of us, that’s not possible or desirable.  Even if you do disengage though, the metadata collection doesn’t end.  Facebook, for example, has ‘shadow profiles’ of all its users that contain data that others report about you. More frighteningly, they have ‘shadow profiles’ of people who do not use Facebook. So it doesn’t matter if you use Facebook or not, they’re collecting data on you. A better solution may be active engagement with an eye to disruption.

3) Participate selectively. In short, don’t participate in ways that are useful to those who are collecting your data. Much of our lives as academics are already on the internet – our affiliations, research interests, etc. So this information is not particularly valuable to an institution like They extract value by understanding your behaviour, so don’t behave.  While it may be valuable for you to have a paper posted on for exposure, it only takes a second to find someone’s e-mail address, or to find their publication elsewhere.  If you limit your interactions through the service, you help limit the data that is being collected about you.

4) Disrupt. Others’ collection of metadata cannot really be avoided, but you can do a lot to disrupt it and make it more difficult for others to know much about you.  For fun, try clicking on random things on your Facebook feed – the targeted ads will change as Facebook tries to incorporate this new behaviour with what they already ‘know’ about you.  Do this enough, and you introduce enough noise into their data that the real you fades into the background.
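A rough Python sketch of the idea, under some loud assumptions: I have no idea how Facebook actually models anyone, so here a "profile" is nothing more than the share of your clicks per topic, and the topic list and click counts are invented.

```python
import random
from collections import Counter

# Hypothetical topic categories an advertiser might use.
TOPICS = ["politics", "sports", "cooking", "travel", "cars",
          "fashion", "gardening", "finance", "music", "pets"]


def interest_shares(clicks):
    """Naive profile: the share of all clicks that falls on each topic."""
    counts = Counter(clicks)
    total = sum(counts.values())
    return {topic: counts[topic] / total for topic in TOPICS}


# The "real you": mostly politics, a little sports and cooking.
real_clicks = ["politics"] * 40 + ["sports"] * 5 + ["cooking"] * 5

# Noise: 200 clicks on randomly chosen topics (seeded so runs repeat).
rng = random.Random(0)
noise_clicks = [rng.choice(TOPICS) for _ in range(200)]

before = interest_shares(real_clicks)
after = interest_shares(real_clicks + noise_clicks)

print(f"politics share before noise: {before['politics']:.2f}")
print(f"politics share after noise:  {after['politics']:.2f}")
```

The real clicks are still in there, but once they are outnumbered by random ones, the profile looks a lot more like everyone and no one.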

Of course there are a lot of better ways to disrupt and make it more difficult for others to collect data about you. Use a VPN. Block cookies. Use HTTPS. Use an encrypted text messenger. Use a password manager. If you have serious needs for anonymity, use Tor. Oh, and for the love of god, use PGP.

A lot of this seems like it’s overkill, but most of it is extremely simple to integrate, and operates in the background.  I’ll write more about these things in future posts.

Disruption is also kind of fun. It allows you to continue to use the services but you also get the satisfaction of knowing that you’re corrupting their clean data on you. I like to imagine that one day, just maybe, some poor data analyst will come across my file and won’t be able to find anything. Then they’ll be sad, and frustrated, and won’t know what to do – it will be difficult for the State to peer into my life.

And that’s exactly how it should be.