Google Books Settlement: Now featuring me

I’ve blogged twice about the Google Books Settlement (here and here), in addition to following it at considerable length on Open Access News. Now, I’m part of it!

A footnote in Pamela Samuelson’s objection tipped me off:

Most other signatories [to the brief] … are members of the Author Subclass by virtue of the book-bound copies of their Ph.D. dissertations filed in research libraries of the universities from which they received their degrees.

I hadn’t realized that dissertations were eligible “works” under the terms of the settlement. That means my late mother’s dissertation, for which I now exercise copyright, would be subject to the settlement terms. An email to the class counsel confirmed it.

I’ve claimed the work on the settlement site. It’s not listed as having been digitized. I’ve set the options as permissive as possible, including a zero price for consumer purchase. (The promised option to apply a Creative Commons license isn’t yet offered.)

I don’t intend to opt out of the settlement: I support the further availability of the work (which, on balance, I think the settlement would increase). If participation brings any financial benefits, well, I’ll take them.

However, I’m entertaining the thought of joining an objection to some terms of the settlement. Since I missed the deadline to object to the original proposed settlement, I’ll only be able to object to terms revised in the amended settlement, but there’s still plenty to be wary of. My concerns are primarily competition, users’ rights (open formats, DRM, privacy), and facilitating rightsholder choices more permissive than the settlement defaults (including open access). If you know of an objection which addresses these points and is accepting additional signatories, please let me know, in the comments or by email.

Nitpicking the Google Books Settlement 2.0

I previously posted on the Google Books Settlement, avoiding the well-trod ground and focusing on points that were salient but hadn’t received much discussion. Now that there’s a new draft of the proposed settlement, I’ll do the same:

  • The revised settlement cuts out a huge swath of international works. There’s no legal reason for this, since the settlement is based in U.S. law, which treats works equally regardless of where they were published. (Moreover, the settlement only provides access to users in the U.S.) Yet I haven’t seen one public interest advocate criticize the loss to access that will be the result of this change.
  • The discussion of this change has mostly been summarized as “foreign language works are now excluded”. But that’s a misleading oversimplification. The new settlement includes works published in the U.S., UK, Canada, or Australia, or registered with the U.S. Copyright Office. That will certainly include many non-English works (remember Canada is bilingual?). It will also exclude many English works: consider New Zealand, Jamaica, India, or many other English-speaking countries.
  • Much criticism has focused on the question of orphan works. This is a bit baffling to me. The settlement would provide an unprecedented access to orphan works. Indeed, to me this is the biggest benefit of the settlement.

    The main criticism of this is that Google would be the only provider of access to these orphan works. Monopoly access is certainly undesirable (particularly given the other flaws of the settlement: the privacy weaknesses, the DRM, the single interface, the overall market position of Google, etc.). But isn’t monopoly access (with antitrust scrutiny) better than no access?

    The only way the answer is “no” is if the settlement holds back progress toward non-monopoly access. For instance, a settlement clause that guaranteed Google competitors the same terms (even if they had to do the scanning themselves) would open competition. Obviously, Google is not interested in such an approach, and since the settlement is a negotiation between Google and the plaintiffs (who I would guess to be agnostic on that question), we shouldn’t expect to see those terms unless the judge or the Department of Justice forces them.

    A legislative solution, such as proposed by the Copyright Office, would be an improvement as well. But orphan works reform has so far stalled in Congress, and I haven’t seen any indication it will be a priority for the current Judiciary Committee. For its part, Google says it will still support orphan works reform if the settlement is approved.

    I’m not sure how to predict what effect the settlement would have on the prospects for legislative action. On the one hand, Congress might say, “It looks like Google has solved that problem, so we don’t have to do anything.” Alternatively, Congress might say, “That Google settlement seems to have riled a lot of people up; I’d rather not put my stick in that antpile.” On the other hand, the settlement might give greater impetus to Google’s competitors to tell Congress, “We’re on unequal terms now; we need you to pass orphan works reform to level the playing field.” No matter what happens, I don’t expect this Congress to pass orphan works reform. How long are we willing to wait?

  • Speaking of orphan works, the Unclaimed Works Fiduciary is a trustee with one hand tied. As I reported for OAN, the UWF — an independent agent entrusted to manage the works of rightsholders who haven’t claimed their works under the settlement — doesn’t have all the powers of an actual rightsholder. Whereas a rightsholder is guaranteed under the settlement the options to, e.g., set a zero price for her work, to apply a Creative Commons license, or to remove DRM, the UWF isn’t guaranteed those same options. In fact, the UWF can only exercise those options with the approval of the Book Rights Registry, which is run by publisher and author representatives. So if the UWF came to the conclusion that the best fiduciary interest of its absentee rightsholders was represented by making their works freely available, it would not necessarily be able to do so. Given the growing suggestions that making a book freely available often has no discernible negative consequence on sales revenues for that book, and in some cases may even increase sales, the settlement should not exclude that option.
Happy Open Access Week

In late 2006 or early 2007, I was looking for ways to get students interested in open access. I had started to become versed in the topic myself a few months earlier, after my library announced it planned to cut subscriptions around the same time the Federal Research Public Access Act was introduced for the first time. At the time, there were no resources for students and no student organizations meaningfully engaged with the issue. I helped the Alliance for Taxpayer Access scrape together some basic information for and about students, but no one paid much attention.

At some point, I had the idea of picking a day to try to focus student attention on open access. We’d choose a date and ask our few student allies to organize some activities to speak out on the issue. This became the National Day of Action for Open Access.

We didn’t have much lead time to plan, and few resources. Not a lot of people participated — but a few did. There wasn’t much attention, but we did get an article in the Washington Post, where I went completely off-message. (Coincidentally, the reporter was Rick Weiss, who later edited Science Next, which included an essay by me about open access.) It was a start.

By the next year, I was consulting for SPARC. We decided to revive the concept, but shifted the schedule and the focus: not just students, we wanted everybody to make noise about open access. For Open Access Day 2008, we had more time and more resources. In organizing it, I dropped the ball too many times, but thankfully someone was always there to pick it up. The response was much bigger; we made a splash.

After 2008, the organizers made two strategic decisions which I disagreed with at the time but which were absolutely right. One was to expand the day to a week to make scheduling easier. The other was not to organize a central event, but instead to rely more on the partners and hosts to take more initiative. I was afraid we’d have insufficient focus and momentum. Instead, we let a hundred flowers blossom. The more flexible schedule, along with an increased role for partnerships (and our experience and increased visibility from the first time around), combined to make Open Access Week the most vibrant outing yet. The breadth and depth of activities worldwide, along with a number of high-profile announcements timed for the week, are truly remarkable. I haven’t been very involved since the early strategic planning, so I can’t claim much credit. But I am thrilled and impressed with the outcome.

Most personally touching for me are the events in Cuba. When I was growing up in Florida, Cuba was only 90 miles across the strait but impossibly far culturally. There is no direct fiber optic link, nor even direct postal service, between Cuba and the U.S.; as an American, I need special permission from my government to travel there. Reportedly, only 2% of Cubans have Internet access. So it was a revelation to realize that our message of open access to scholarship had resonated in Cuba. For me, it’s a symbol of what open access is all about: the free exchange of knowledge and ideas worldwide.

Happy Open Access Week. May it be the first of many.

Scholarly publishers shake down a copy shop

A group of scholarly publishers — Blackwell, Elsevier, Oxford University Press, Sage, and Wiley — last week won a judgment against a Michigan copy shop for assisting students in copying course packs. The students were copying articles from scholarly journals and chapters from scholarly books for assigned readings in their college classes.

A student wanting a coursepack comes to Excel’s [the copy shop] premises and fills out a form on which the student writes the course the student is enrolled in and for which the student needs the material. The form contains a statement to the effect that: “I am a student in this class and am making a copy for educational purposes.” The student signs and dates the form. The student hands the form over to an Excel staff member who retrieves the “master,” hands it to the student, who then makes a copy using Excel’s copy machines. [...]

Excel does not pay copyright fees to the publishers, which it admits enables it to charge a lower fee than if the students obtained the materials at a traditional “copyshop” [...]

Excel’s position that this is a case of protected student copying is sophistry. [...] Simply put, copyright law should not turn on who presses the start button on a copier. Excel’s actions violate the publishers’ copyrights.

My purpose is not to argue the legal merits of the decision. Rather, I want to highlight this case as an example of the social impacts of closed-access scholarly publishing. I particularly want to address researchers here.

Scholars: You conducted your research for the advancement of knowledge. In many cases, your research was supported by taxpayer dollars, whether in the form of a research grant or a university salary. You entrusted your research to the publisher, for the purpose of disseminating it. In many cases (for scholarly journals, not necessarily for books) you did so for no remuneration from the publisher. The publisher sells access to your work to universities and reaps massive profits: Elsevier alone reported more than $800 million in profits in 2008. When a small business tries to help students get access at a reduced price, the publisher sues to shut it down.

If that’s scholarship, then I want no part of it.

The publisher is wielding the copyright in your work as a legal bludgeon and purporting to act on your behalf. If you know this and you sign a copyright transfer with a publisher, then you are responsible.

There is an alternative.

For reference, the list of infringed works is here. Some are more than 20 years old.

AcaWiki launches: free summaries of academic papers

As I reported at Open Access News, AcaWiki launched yesterday. The idea is free (gratis, libre), editable (wiki) summaries of academic papers. These summaries might be useful to scan during a literature review or when studying for a class, or they might help make an article comprehensible to a non-specialist (a researcher in another discipline, an interested member of the public).

So what’s the point of AcaWiki when almost all articles have abstracts, which are summaries and usually available gratis? Well, AcaWiki summaries are also libre (CC Attribution license), so they invite reuse: mashup, translation, and so on. They’re also editable, so they can evolve and be improved.

Abstracts vary widely, usually shaped by the journal’s format: sometimes they’re several paragraphs, sometimes just a few sentences. They might outline the methodology or they might not. They are usually written at the level of specialists in that field, so they may or may not be much use to other readers.

There’s room for improvement and innovation in the world of summary, in other words. For instance, Emerald launched a program asking authors to provide a summary highlighting potential applications. RNA Biology requires its authors to write up their findings on Wikipedia. BMJ publishes only one-page abridgments in its print edition, with the full article available online.

For a more direct comparison, see WikiSummary, which predates AcaWiki but covers only political science.

Two other points of comparison: journalism / press releases and Wikipedia.

Press releases are gratis; science journalism may or may not be gratis; both are rarely libre. They only cover new studies: good luck finding coverage of an article from 1989. They rarely provide a full citation to the original article. They often discuss only the findings, with little consideration of methodology. They frequently focus on studies with controversies or practical applications, rather than new theories or research methodologies. In reporting the most interesting (a.k.a. most titillating) of the findings, journalism sometimes distorts the impression of the overall study. Meanwhile, press releases try to paint the most positive picture. Since they’re written for a general audience, and often not written by someone with a background in the field, they may be too general.

If we consider research blogging in this category, conversely, the writing may be too technical. It may be more commentary or critique than summary.

Wikipedia is gratis and libre. It’s written for non-specialists (in theory), but can also go into more detail. The main difference from AcaWiki is that most academic papers will not be “notable” enough to merit their own Wikipedia page; even if someone wrote them, they would probably get deleted. As an encyclopedia, Wikipedia provides a higher-level overview. There could be some other conflicts with Wikipedia policies, such as those against publishing original research or authors writing about themselves or their work.

All of the aforementioned resources have their uses, but as we can see, AcaWiki has its niche. I hope it thrives there.

In disclosure, I did some paid work for AcaWiki some months ago, but am not actively involved in the project.

Funding a transition to OA

As I mentioned in my last post, a group of American universities has signed an agreement to finance open access journals. The previous post alluded to my criticisms of the compact and I’ll flesh them out here.

It’s a big step forward and Harvard has already followed up on its commitment. I hope to see the other universities do likewise in short order, and to see other schools sign on as well.

Stevan Harnad is right that, without also ensuring OA to their research output by adopting a “green” OA policy, funding “gold” OA journals is well-intentioned but ineffectual. Harvard’s fund lines up perfectly (and hopefully the policies will spread to the rest of Harvard’s schools soon); of the others, only MIT currently has a self-archiving mandate. While more support for OA publishing is needed and valuable, universities could do more in the short term by adopting OA mandates.

My main gripe with the compact itself is that it only covers funding for publication charges, to the exclusion of other financing models. I don’t have a problem with publication charges when done right and have even suggested more experimentation with submission fees. (Interestingly, the new Harvard fund explicitly includes submission fees as eligible.) But fewer than one-third of OA journals currently use publication charges: 70%+ rely on other revenue sources (or have no budget at all).

Stuart Shieber, the architect of the OA compact, knows this — he’s the one who did that calculation — but he’s convinced it’s a fluke. In the PLoS Biology article where he introduced the compact concept, Shieber wrote, “processing fees are the only revenue source that inherently scales directly with the publishing services provided by a journal”. In other words: Some weirdos here and there might get their money from somewhere else, but the only way to take OA publishing big time is with processing fees. But I’m not convinced that that’s the case.

First of all, academic publishing has traditionally been a constellation of weirdos and edge cases. Academic publishers include giant publishing conglomerates and boutique commercial publishers, massive scholarly societies and much more esoteric ones, government agencies and think tanks, universities and some guy publishing out of his department office. Some turn a profit, some break even, some lose money, and some have no budget at all. Some are subsidized by members or university departments, and some subsidize the organization’s other activities. It’s a motley bunch and, now that information is divorced from its paper container, I think predictions that any one revenue model will dominate are perilous at best.

Never mind the fact that even many OA publishers who charge processing fees also draw revenue from other streams: reprints, institutional and individual memberships, print subscriptions, philanthropic or public underwriting, subsidy by the host organization, in-kind support, and so on. Even the practitioners of processing fees see fit to diversify their revenue base.

The compact acknowledges this, but doesn’t do anything about it:

Many, indeed most, open-access journals do not charge processing fees. Such journals are no less deserving of support, and universities are urged to support them as well (as many already do), through direct subvention, support for personnel, equipment, and other facilities. However, the compact was not seen as the right method for institutionalizing this support.

So kudos to the compact’s designers and signatories for committing to put some much-needed money into “gold” OA. The quicker we can flip journals — or build new quality journals and let any dinosaurs that refuse to evolve eventually die off — the better; cash is a very good incentive for publishers to make that happen. But I don’t see why the compact couldn’t have been a commitment to fund OA journals in general rather than to fund publication charges at OA journals.

Lead, follow, or get out of the way

Harvard and 4 other universities did something neat recently: they agreed, in principle, to help finance open access publishing. Of course, the devil’s in the details (more on that in a future post), not least of which is that, at the time of the agreement, none of the schools had actually dedicated any money to match their commitments. Still, it’s a start, and it should be music to the ears of publishers — most of whom have beat a constant rhythm of “open access is all well and good, but we just want to get paid” — or so you might think.

Robert B. Townsend, assistant director for research and publications of the American Historical Association, said he was skeptical of the compact, at least based on what was released Monday.

“My ambivalence is the utter lack of clarity, and the tendency in most open access discussions to treat the science journals as normative,” he said. “The lack of recognition of the vast differences between disciplines makes this look like more of the usual one-size-fits-all open access thinking that prompted our efforts on the [National Humanities Alliance] report. I hope that report will have some effect on their thinking, if and when these universities try to turn their words into deeds, but I am not optimistic.”

It’s hard to guess the exact context of Dr. Townsend’s comments, but his comments here seem to be directed against the OA movement in general. He doesn’t criticize any specific aspect of the compact. It’s hard to interpret his intent as anything but obstructionism: “Stop — we haven’t figured out how to make this work for the humanities yet.”

The NHA report is generally a thoughtful look at publishing in the social sciences and humanities, with a particular idea as to how OA might work there. (The biggest flaw of the NHA report, which I mentioned in my comments on OAN, is ironically the biggest flaw of the OA compact: They devote all their energy to publication charges and pay only lip service to any other funding models — despite the fact that fewer than one-third of OA journals use publication charges.) Of course, societies in HSS are justified in trying to manage the transition to OA in the way that least disrupts the journals, but to date they have in general been overly cautious.

One could go farther and say that a little disruption can sometimes be a good thing. The NHA study concluded that the per-article costs in HSS journals are three times those of STM journals. I have yet to see any evidence that HSS journals are three times as valuable to their readers. In another context, those numbers would elicit adjectives like “bloated” and “inefficient”. Of course, in academia every field is exceptional and no one would suggest such a thing here. Each discipline must be allowed to retain its peculiar traditions, regardless of cost, because by God that is just the way things are done. But I digress.

There is indeed one size that fits all when it comes to scholarly knowledge, and that is that scholarly knowledge ought to be free. That certainly does not mean that every journal, publisher, or discipline must use the same revenue model, but they all need to get us to the same outcome. The HSS community has not seriously contested that principle, but so far they have shown a lack of vision and creativity in getting us there (a charge I’ve leveled against others before). If HSS societies are “not optimistic” about the OA compact or other current OA efforts, I should like to see them undertake more experiments of their own, rather than criticizing without proposing real alternatives. OA works for thousands of journals, hundreds in the social sciences and humanities, for many different types of journals and with many different revenue models. Let’s stop dragging our feet and make it happen.

Advice on email for political campaigns

Email addresses are the coin of the realm nowadays in political campaigning. More political efforts — whether candidates, partisan groups, or advocacy organizations — ask for your email address than probably any other piece of contact information. And email addresses matter — at least, people are starting to suspect they do. Recently, I heard a rumor that contacting Organizing for America from the same address you used to donate to the Obama campaign would have more impact. I vaguely recall hearing similar advice about applying for jobs during the transition. It seems that your email address is an increasingly important identifier, beyond just a means of communication.

But political groups are widely getting it wrong when it comes to a minor, but valuable, use of email addresses as identifiers.

The issue is that there’s more than one way to represent the same piece of information. For instance, you might write a phone number any of these ways:

  • (555) 555-1234
  • 555-555-1234
  • 555 555 1234
  • 555.555.1234
  • 5555551234

Similarly, the same street address might be written in different ways:

  • 123 Any Street, Apt. 4
  • 123 Any St., #4

And so on. Any database is certainly going to recognize those phone numbers and street addresses as the same thing, so the same person doesn’t get 5 different phone calls and 2 different mailings. But the same thing doesn’t happen with email when people use sub-addresses.
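
To make the phone number case concrete, here is a minimal sketch of the kind of normalization a database might do before comparing records. It assumes US-style numbers with no country code or extension, and the function name normalize_phone is made up for illustration; a real system would need more careful rules.

```python
import re


def normalize_phone(raw: str) -> str:
    """Strip everything except digits so different spellings compare equal.

    Assumption: US-style ten-digit numbers, no country code or extension.
    """
    return re.sub(r"\D", "", raw)


# The five spellings listed above.
variants = [
    "(555) 555-1234",
    "555-555-1234",
    "555 555 1234",
    "555.555.1234",
    "5555551234",
]

# All five collapse to a single key, so the person gets one call, not five.
keys = {normalize_phone(v) for v in variants}
```

Street addresses are harder to normalize than phone numbers, but the principle is the same: compare on a cleaned-up key, not on the raw text someone typed.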

I’ve written about plus-addressing before. I’m under no illusion that sub-addressing is used by a massive portion of the population, but I figure it’s probably used by at least a few percent of people — and for those people, it’s important. They use it for a reason: to better filter their mail (so they can find your message more easily!), to track how their address is shared, etc. These people will notice whether you respect their wishes and habits, and it will influence their impression of you. So it behooves you to play nicely with their addresses, especially since it’s so easy to do.

All it takes is two easy steps:

  1. Your Web forms (and any other methods you use for collecting and managing email addresses) should respect sub-addressing. Subscription forms frequently reject + as an “invalid character”, but according to the relevant RFC, it’s not. Even more annoying is when a subscription form accepts + but the unsubscribe option doesn’t (and even worse when there’s no other apparent way to get an address off the list). For people who use sub-addressing, this is an unnecessary hassle, not to mention a potential violation of CAN-SPAM.
  2. Your database should collate sub-addresses. In other words, if I’ve asked you to contact me at a particular sub-address, you should contact me there. But if you also have other sub-addresses of the same mailbox in your database, you should know that they’re the same user. The benefits of this will depend on exactly how you use those data, but I think it’s a good principle to start from. Importantly, whatever you do should be transparent and modifiable to the user. I can imagine, for instance, logging in with one sub-address and seeing a message that says “We also have another of your sub-addresses in our database. Would you like us to merge those identities?”
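
The two steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not a complete implementation: it assumes + is the sub-address separator (some mail systems use a different character or none), it treats the local part as case-insensitive (true in practice for most providers, though not guaranteed), and all function names and example.com addresses are hypothetical.

```python
import re
from collections import defaultdict
from typing import Dict, List, Tuple

# Step 1: a deliberately lenient shape check that does not reject "+".
# This is NOT full RFC validation, only "something@something.tld".
LENIENT_EMAIL = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")


def is_acceptable(address: str) -> bool:
    """Accept any plausibly shaped address, including sub-addresses."""
    return LENIENT_EMAIL.fullmatch(address) is not None


def split_subaddress(address: str, separator: str = "+") -> Tuple[str, str, str]:
    """Split 'base+tag@domain' into (base, tag, domain); tag may be empty."""
    local, _, domain = address.rpartition("@")
    base, _, tag = local.partition(separator)
    return base, tag, domain


def canonical_key(address: str) -> str:
    """Step 2: a key for matching records only, never for sending mail."""
    base, _tag, domain = split_subaddress(address)
    return f"{base.lower()}@{domain.lower()}"


def group_by_person(addresses: List[str]) -> Dict[str, List[str]]:
    """Group stored addresses that appear to belong to the same mailbox."""
    groups = defaultdict(list)
    for address in addresses:
        groups[canonical_key(address)].append(address)
    return dict(groups)


# Hypothetical stored addresses: two sub-addresses of one mailbox, plus another user.
stored = ["jane+donate@example.com", "jane+news@example.com", "sam@example.org"]
groups = group_by_person(stored)
```

Note that the original addresses are kept exactly as entered. The canonical key is only used to notice that two records probably belong to the same person, at which point you can ask the user whether to merge them and keep mailing whichever address they prefer.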

Bonus transparency best-practice: Don’t mask the TO: field — otherwise, recipients can’t tell where you’re contacting them.

