Design choices for productive, secure, data-intensive research at scale in the cloud

Diego Arenas, Jon Atkins, Claire Austin, David Beavan, Alvaro Cabrejas Egea, Steven Carlysle-Davies, Ian Carter, Rob Clarke, James Cunningham, Tom Doel, Oliver Forrest, Evelina Gabasova, James Geddes, James Hetherington, Radka Jersakova, Franz Kiraly, Catherine Lawrence, Jules Manser, Martin T O'Reilly, James Robinson, Helen Sherwood-Taylor, Serena Tierney, Catalina A Vallejos, Sebastian Vollmer, Kirstie Whitaker

August 2019

Abstract

We present a policy and process framework for secure environments that support productive data science research projects at scale. The framework combines prevailing data security threat and risk profiles into five sensitivity tiers and, at each tier, specifies recommended policies for data classification, data ingress, software ingress, data egress, user access, user device control, and analysis environments. By presenting design patterns for security choices at each tier, and by using software-defined infrastructure so that an independent, secure research environment appropriate to its classification can be instantiated for each project, we hope to maximise researcher productivity and minimise risk, allowing research organisations to operate with confidence.

Sebastian Vollmer

Professor for Applications of Machine Learning