Data Collection and Learner Control: Moving Beyond inBloom


In conversations about student privacy and data collection, we often go off track due to a lack of clarity about what information needs to be collected, and with whom that information needs to be shared.

Some of the federal data collection requirements at the K12 level are described in the EdFacts documentation. The file specifications appear to vary by year; for the 2012-2013 school year, states needed to report on over 100 topics, including migrant students supported, special needs students supported, disciplinary incidents, and types of staff evaluation.

This is aggregated data, submitted by the states to the federal government. This data does not contain personally identifiable information, although some data submitted by states does get down to the school level. However, for states to prepare this information for the federal government, state educational agencies need student- and teacher-level data, unless school districts each prepare individual reports. More importantly, for schools to meet the needs of individual students, schools need student-level data.

As an example, states file discipline records that describe incidents in which a student was removed from their "regular educational setting for at least an entire school day" if the "incident is a result of drugs, alcohol, weapons possession or violence." The data reported here gets pulled from student-level incident reports. In the CEDS data schema, we can see the data that would need to be collected at the school level (which would contain personally identifiable information) in order to satisfy the federal reporting requirements (the reports themselves DO NOT contain student information).
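To make the gap between the two levels concrete, here is a minimal sketch of how student-level incident records could be rolled up into an aggregate count. Everything in it is an assumption for illustration: the `IncidentRecord` fields, the reason codes, and the `aggregate_incidents` function are hypothetical, and are not CEDS element names or the EdFacts file format.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class IncidentRecord:
    # Hypothetical student-level record; field names are illustrative only.
    student_id: str      # personally identifiable, stays at the school level
    school_id: str
    reason: str          # e.g. "drugs", "alcohol", "weapons", "violence", "other"
    days_removed: float  # school days removed from the regular educational setting


# Assumed reason codes, loosely following the quoted reporting criteria.
REPORTABLE_REASONS = {"drugs", "alcohol", "weapons", "violence"}


def aggregate_incidents(records):
    """Count reportable incidents per (school, reason), dropping student IDs."""
    counts = Counter()
    for record in records:
        # Only removals of at least one full school day for a listed reason count.
        if record.reason in REPORTABLE_REASONS and record.days_removed >= 1:
            counts[(record.school_id, record.reason)] += 1
    return dict(counts)


records = [
    IncidentRecord("s-001", "school-A", "drugs", 2),
    IncidentRecord("s-002", "school-A", "drugs", 1),
    IncidentRecord("s-003", "school-A", "other", 3),
    IncidentRecord("s-004", "school-B", "violence", 0.5),
    IncidentRecord("s-005", "school-B", "weapons", 5),
]

report = aggregate_incidents(records)
print(report)
```

The output holds only counts keyed by school and reason, but the input that produced it had to carry a student identifier on every row. That is the point of the example: someone has to hold the identifiable data for the aggregate report to exist.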

It's worth noting that both inBloom and EdFi implement the CEDS data standard, and that the feature set of Pearson's Powerschool claims "Push-button State Reporting." A scan of the prebuilt reports in Powerschool turns up reports on grades and discipline incidents that include personally identifiable student data. While inBloom has drawn nearly all of the fire for privacy issues, both EdFi and Powerschool collect comparable data sets for more students, and have been doing so for years.

The time required to manage and prepare data for state and federal reporting requirements is onerous in the best of circumstances. A common data standard like CEDS simplifies the process. The value and merits of CEDS, and the business opportunities that data reporting creates, are both valid points of concern, but until the federal reporting requirements are streamlined, schools are officially in the data management business. (The fact that CEDS and the Common Core Standards were both shepherded into existence by the same organization is either an indication of organizational focus or iron-clad evidence of a Conspiracy, depending on your viewpoint.)

The recent near-exclusive focus on inBloom works to the detriment of addressing broader concerns about student privacy. Compared to other companies offering datastores for student data, inBloom offers some key advantages: their codebase is available under an open source license, and their datastore is freely available online. This means that any entity that wants to can stand up a datastore on their own hardware, without interacting with inBloom at all. This also means that, unlike other services currently storing personally identifiable student data, third parties can do security audits on the code. With inBloom, we don't need to take the vendor's word on security.

But to be clear: ALL of the current options raise serious privacy and security concerns. These concerns predate inBloom, and if inBloom were to disappear tomorrow, the same security and privacy concerns would continue to exist. As a result of federal reporting requirements that predate both inBloom and CCSS, nearly every district is currently storing data electronically, with predictably mixed results. There is no such thing as 100% security, in IT or in life.

Additionally, this post is focused on the privacy and security concerns raised by data collection done in schools to meet state and federal reporting requirements. A related and equally important topic is the set of privacy and security issues raised by the use of third party services like Google Apps, Bing for Schools, Khan Academy, Schoology, Edmodo, etc., all of which collect, store, and share student data.

One thing we can all agree on is that the existence of inBloom raised awareness around issues of privacy. These issues existed before inBloom, and they have yet to be addressed. Privacy and security of student data need to be looked at via policy that spans data management tools. One element conspicuously absent from all data management plans is the ability for learners, parents, and teachers to review, vet, and opt out of sharing the data stored about them. Arguably, any data sharing outside the school should require an opt-in, with student privacy as the default. Another element missing from data management policies is a concrete sunset on data retention. Both of these components relate to the real elephant in the room when it comes to managing student information: ownership.
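As a rough illustration of what opt-in sharing and a retention sunset could look like if they were written into a data management tool, here is a small sketch. All of it is hypothetical: the `StudentRecord` shape, the five-year `RETENTION_PERIOD`, and the recipient name are assumptions made up for the example, not taken from any existing product or policy.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta


@dataclass
class StudentRecord:
    student_id: str
    collected_on: date
    # Privacy is the default: nobody outside the school is approved
    # until the learner or a parent explicitly opts in.
    sharing_opt_ins: set = field(default_factory=set)


# Hypothetical sunset period, chosen only for illustration.
RETENTION_PERIOD = timedelta(days=365 * 5)


def may_share(record, recipient):
    """Sharing is allowed only with recipients that were explicitly opted in."""
    return recipient in record.sharing_opt_ins


def is_expired(record, today):
    """A record is past its sunset once it is older than the retention period."""
    return today - record.collected_on > RETENTION_PERIOD


def purge_expired(records, today):
    """Return only the records still inside the retention window."""
    return [r for r in records if not is_expired(r, today)]


recent = StudentRecord("s-001", date(2013, 9, 1))
old = StudentRecord("s-002", date(2005, 9, 1))

shared_before = may_share(recent, "vendor-x")  # no opt-in yet
recent.sharing_opt_ins.add("vendor-x")         # explicit opt-in
shared_after = may_share(recent, "vendor-x")

kept = purge_expired([recent, old], today=date(2014, 1, 1))
print(shared_before, shared_after, [r.student_id for r in kept])
```

The design choice worth noticing is that the empty set is the starting state. A record with no recorded decision is not shared with anyone, and a record past its sunset is dropped without anyone having to remember to delete it.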

In an ideal world, the data trail created by a learner would belong to the learner. (The fact that this paradigm would put companies like TurnItIn.com out of business is an added bonus.) Advocates for increased data collection tend to overlook the reality that education doesn't exist to provide an area to be researched or assessed; education exists to serve learners. To the extent that we collect data about the process of learning, our data collection needs to serve learners as well.
