Data Security Whitepaper
Introduction
This paper serves as an overview of Caseflow's security and compliance with current regulations within the European Economic Area (EEA). It addresses the most common customer concerns relating to security, privacy, and data retention. It outlines contractual agreements with third parties, elaborates on the data assets stored in relation to data subjects on the platform and how this information is protected, and explains the steps taken to ensure adherence to currently applicable GDPR requirements.
Terminology
The General Data Protection Regulation (“GDPR”)
The General Data Protection Regulation (“GDPR”) came into force on 25 May 2018. It focuses on strengthening and unifying data protection for all individuals within the EU. The GDPR has a broad reach; it applies to organizations located within the EU, as well as organizations outside the EU if they offer goods or services to people residing in the EU.
This white paper is not intended as legal advice and shouldn’t be considered as such. We strongly recommend that you seek legal advice if you’re unsure as to how the GDPR affects your company (the data controller) and your use of Caseflow (the data processor).
Compliance With The GDPR
Caseflow's User Model
Application level
Caseflow's services introduce four distinct types of users who can access data through the application:
Organizers
Admins
Participants
Judges
These users have differing purposes, and thus the nature of the data stored about them differs. The following sections present the information processed and/or stored about each type of user.
Organizer
An Organizer is a Caseflow employee or representative in charge of initializing an event and inviting (by email) Admins to produce content and manage activities related to that event, with Caseflow's prior consent.
The Organizer has to provide a full name, email address, and company name on registration, in order to initially create the company and the event that the Admins will manage. Organizers can also produce content and manage activities related to the event. However, Organizers do not have access to all events, only to the ones they created. For these events, Organizers can see all the Participants that have joined, all the case material they have provided, and the ratings for that material given by the Judges.
Admin
An Admin is an individual appointed to represent an organization by appointing Judges and producing content for an event. An Admin has the privileges required to add and modify the profile of the organization they represent. Examples of data fields that an Admin can add/modify are: company profile picture, company name, company website, foundation date, industry, company size, etc.
At registration time, Admins have to provide a full name and email address. Caseflow does not need or store any further data about Admins.
Admins are able to view all participants that have joined the event, have access to all the case materials that were provided for their event by the participants and can see the average score given by the judges for each case material.
Participant
A Participant is the most numerous type of user. Participants sign up to the Caseflow platform directly; in order to sign up, a Participant must provide an email address. After signing up, a Participant can optionally add information to their profile page. A non-exhaustive list of information that a profile page can contain includes: profile picture, skills, resumé, date of birth, gender, birthplace, current residence, and educational background.
Participants do not have access to the case material of other Participants in the event, nor to the list of Judges and Admins of the event. Participants cannot see their scores unless the Admin decides to reveal them, and even then, a Participant can only see their own scores and, if the Admin chooses to reveal it, an average across all Participants, never the scores of specific other Participants.
A Participant's profile is public on Caseflow, meaning any user who knows the correct URL of the Participant's profile can see it, and with it, the data the Participant provided. It is worth noting that this URL contains a GUID. Profiles are public purely for business purposes, as Caseflow is a talent acquisition site. Aware of the potential risk, Caseflow can easily switch this feature off, making Participant profiles visible only to the Organizers and Admins of the events the Participant is participating in.
Judge
A Judge needs to provide only a name and email address on registration. A user can only become a Judge by invitation from an Organizer or Admin; this happens via email. A Judge is permitted to read, review, and rank case material submitted by Participants. Furthermore, a Judge has the ability to provide feedback to Participants.
Judges do not have access to the participant data as it is anonymized for them. Judges also cannot see the admins and organizers of the event, and cannot manage event content.
Purpose of User Model
The purpose of Caseflow's user model is to minimize the hassle of signing up for multiple case competitions and hackathons. The user model is also implemented to distinguish between the roles of different users; some may require more or fewer privileges than others (e.g. Judges). By processing the data outlined in the user model, Caseflow is able to deliver services that are aligned with the interests of its users. This holds for Participants, but also for Organizers, who often represent an organization that seeks knowledge about a specific problem; the platform enables such Organizers to communicate with and support Participants during a case-solving scenario. Caseflow requires only the bare minimum of information (full name and email) from users, to ensure uniqueness of users and entities. All other information on the platform is provided voluntarily.
Caseflow does not store any kind of password. Every time a user (Organizer, Admin, Participant, or Judge) wants to log in, they receive by email a unique link that is valid for a certain period of time; they need to click the link to enter the Caseflow platform. Every new login generates a new unique link, invalidating previous links in the process.
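The passwordless flow above can be sketched as follows. This is a minimal illustration, not Caseflow's actual implementation: the in-memory store, token format, and fifteen-minute validity window are all assumptions for the sake of the example.

```typescript
import { randomBytes } from "crypto";

// Hypothetical server-side store: at most one active token per email.
const activeTokens = new Map<string, { token: string; expiresAt: number }>();

const TOKEN_TTL_MS = 15 * 60 * 1000; // assumed validity window

// Issuing a new link invalidates any previous one for that user,
// because the stored entry is simply overwritten.
function issueLoginToken(email: string, now: number = Date.now()): string {
  const token = randomBytes(32).toString("hex");
  activeTokens.set(email, { token, expiresAt: now + TOKEN_TTL_MS });
  return token; // embedded in the emailed link, e.g. /login?token=...
}

// A token is accepted only if it is the latest one issued and not expired;
// it is consumed on success, so each link works at most once.
function redeemLoginToken(email: string, token: string, now: number = Date.now()): boolean {
  const entry = activeTokens.get(email);
  if (!entry || entry.token !== token || now > entry.expiresAt) return false;
  activeTokens.delete(email);
  return true;
}
```

The overwrite-on-issue design is what gives the "every new login invalidates previous links" behaviour without any explicit revocation list.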
System level
There are specific types of users who can access data through the infrastructure. These users are trusted members of the authorized parties and of the Caseflow development and management team. Some developers and the technical lead are located outside of the EEA, but none of these Caseflow employees have access to the personal data of our customers or users:
Technical lead
Developers
Management team
Technical lead
The technical lead is a member of the Caseflow development team responsible for setting up the CI (continuous integration) and CD (continuous deployment) pipelines on Azure (automatic builds and deployments of new application versions) and for monitoring the application infrastructure on Azure (logs, health). The technical lead has no access to application data.
Developers
The developers of the Caseflow platform have no access to the infrastructure, the database, or user files.
Management team
The management team consists of Caseflow employees who work from within the EEA. They have access to users' data through dashboards in Power BI in order to analyze statistics and prepare reports for customers.
Caseflow Privacy Policy
Caseflow is based in Denmark, which is part of the EEA. As such, Caseflow's technology, systems, and processes adhere to the relevant legal requirements on data privacy, and Caseflow's Privacy and Cookie Policy reflects this.
Information Access Control
The services and databases are subject to rigorous security operations protocols provided by Caseflow. Caseflow ensures that only employees in the EU have the required credentials to obtain access to these services, and only when absolutely necessary.
Furthermore, an added layer of security is ensured by denying access to unknown IP addresses, and Customer Lockbox for Azure has been enabled. The Azure servers that host the services, maintained by Caseflow, carry ISO 27001 certification, and SOC 2 third-party audit reports can be provided on request by a customer under an appropriate NDA.
Storage of Data
Caseflow's user model implies that information provided by a Data Subject is persisted on databases that are accessible from linked application servers. Data is stored on Azure database infrastructure (West Europe).
Caseflow chose Azure as its cloud service provider because it offers capabilities such as scaling up for large spikes in traffic, computing power, and delivery over a CDN, to name a few. Azure is one of the biggest cloud providers, servicing and hosting many systems, and in terms of data privacy and compliance it is one of the leading providers.
Securing the Infrastructure
Servers and database
All application servers and database servers run inside a virtual network, and access to them is regulated by role-based access control (RBAC). Access is secured by extensive firewall rules that explicitly allow access to the resources in the virtual network. The rules are set up in such a way that only the application servers can access the database; all other access is blocked, so even with the correct link, username, and password, you still cannot access the database. Application servers can only be accessed by the load balancer and the CD infrastructure; all other access is blocked, even if you know the correct IP address or URL of the server.
The load balancer is the only entry point from the public internet, and it enforces HTTPS-only traffic. This ensures that the data is encrypted, is not tampered with, and that we talk with the intended users (protecting against attacks such as man-in-the-middle). It should be noted that the database server cannot be accessed from the load balancers even though they share the same virtual network.
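The deny-by-default rule set described above can be illustrated with a small sketch. The component names and rule table are illustrative only; in practice these constraints live in Azure network security configuration, not application code.

```typescript
// Each rule is an explicit allowance; anything not listed is denied.
type Rule = { from: string; to: string };

// Illustrative rules mirroring the prose: only the load balancer and CD
// pipeline may reach the app servers, and only app servers may reach the DB.
const rules: Rule[] = [
  { from: "load-balancer", to: "app-server" },
  { from: "cd-pipeline", to: "app-server" },
  { from: "app-server", to: "database" },
];

// Deny-by-default evaluation: access is allowed only if an explicit
// rule exists for this (source, destination) pair.
function isAllowed(from: string, to: string): boolean {
  return rules.some((r) => r.from === from && r.to === to);
}
```

Note that under this model the load balancer cannot reach the database even though both sit in the same virtual network, matching the last sentence above.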
Caseflow chose to tighten these access security measures in order to mitigate the risk of unintended errors. We are aware that not only malicious users can pose a danger to your data; developer error, such as setting a wrong URL by mistake or running a command against production instead of beta, can also damage and leak data.
Files storage
Files that users upload are stored on Azure storage servers and are protected with a secure key. This means that a user who wants to upload or access a file needs a special pre-signed key issued by the application servers. Keys are issued only to authorized users, expire after a certain time, and are limited to the user's current public IP address, so a pre-signed key cannot be shared.
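A hedged sketch of how such a pre-signed key could be derived and checked, assuming an HMAC over the file path, client IP, and expiry timestamp. Azure's actual shared access signatures differ in detail; the secret and field layout here are illustrative only.

```typescript
import { createHmac, timingSafeEqual } from "crypto";

// Illustrative secret; in practice this lives outside source control.
const SIGNING_SECRET = "server-side-secret";

// Sign a grant binding one file path, one client IP, and one expiry time.
function signFileKey(path: string, clientIp: string, expiresAt: number): string {
  return createHmac("sha256", SIGNING_SECRET)
    .update(`${path}|${clientIp}|${expiresAt}`)
    .digest("hex");
}

// The storage front door re-derives the signature; any change to the path,
// IP, or expiry invalidates the key, and a past expiry rejects it outright.
function verifyFileKey(
  path: string,
  clientIp: string,
  expiresAt: number,
  key: string,
  now: number = Date.now()
): boolean {
  if (now > expiresAt) return false;
  const expected = signFileKey(path, clientIp, expiresAt);
  if (key.length !== expected.length) return false;
  return timingSafeEqual(Buffer.from(key), Buffer.from(expected));
}
```

Binding the client IP into the signed payload is what makes a leaked key useless from any other machine, as described above.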
CI/CD
Publishing, or releasing, new versions of the application is also a potential security risk if done manually, since it would require the operations team to have access to parts of the infrastructure. However, Caseflow has completely automated that process following modern CI/CD principles, and it does not require any human interaction. This means that no developer or operations engineer has access or credentials to the infrastructure. The Continuous Deployment pipeline takes care of publishing the application on the servers in the protected virtual network.
Passwords, certificates and any other access information is never stored in the source code. Upon deployment, the Continuous Delivery system packages the application with encrypted keys that are not available to the developer team.
Many attacks and data leaks happen because some systems, even if they take many security precautions, still depend on packages that have security issues, and attackers can leverage those weak links to access forbidden resources. Caseflow uses DevSecOps principles to some extent to ensure that outdated dependencies in our system are promptly updated: the automated release pipeline will fail if it detects such dependencies, forcing the dev team to update them. New versions of packages usually patch security vulnerabilities in addition to adding features, and as long as we stay up to date with them, we limit the attacks that can happen.
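The pipeline gate described above can be sketched as a simple threshold check. The report shape and blocking thresholds are assumptions for illustration; the real scanning is performed by the CI tooling, not application code.

```typescript
// Hypothetical shape of one finding in a dependency scan report.
type Advisory = {
  pkg: string;
  severity: "low" | "moderate" | "high" | "critical";
};

// Fail the release if any advisory meets the blocking threshold,
// forcing the team to update the dependency before the build can ship.
function shouldFailBuild(
  advisories: Advisory[],
  blockAt: Advisory["severity"][] = ["high", "critical"]
): boolean {
  return advisories.some((a) => blockAt.includes(a.severity));
}
```

In a pipeline, a `true` result would translate to a non-zero exit code, which is what actually stops the release stage.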
Caseflow uses the cloud-hosted Azure DevOps server for all CI/CD pipelines, as well as Git for code version control. The automated builds and releases run on Microsoft-hosted build agents. In order to access the Azure DevOps server, the build agents, the code base, or the CI/CD pipelines, you need to be added as a Caseflow team member of the Caseflow project in the Azure DevOps server. Apart from CI/CD processes, Azure DevOps offers a whole toolset for managing projects: task boards, team membership, code versioning, package feeds, and pull request and build policies, all in one place instead of in separate tools.
Data Protection Measures
As data processor and data controller, Caseflow is tasked with implementing appropriate technical and organizational measures to ensure the protection of data. These include, but are not limited to:
Ensuring encryption at rest for personal and all other data, by utilizing the Azure option for transparent data encryption for SQL Database
Ensuring confidentiality, integrity, availability, and resilience of processing systems, by applying the security measures mentioned in this document, running multiple instances of the servers, and applying retry mechanisms on the application level
Restricting who may access personal data, with only the technical lead having access to the infrastructure, in order to set it up and apply fixes and configuration changes
Ensuring availability of, and access to, personal data in the event of a physical or technical incident, by being able to access the database and provide reports in case of an application server incident, and by regularly backing up the database and being able to restore it in case of a database server incident
Performing regular testing and evaluation of the security of the Caseflow platform with bi-weekly scheduled testing by the developers, and updating the application and infrastructure with security features
Application level
Caseflow ensures data protection by using carefully chosen technology stacks and leveraging their security measures, as well as by applying security checks and strict validations throughout the code. The application is architected as a front-end and a back-end. The back-end is hosted on the infrastructure described in the previous sections; the front-end is served to users' machines and executed there. That is why we do not store any secret information on the front-end, and use it exclusively for presenting data served by the back-end. The back-end makes sure that all data is served only to users who have access to it.
Front end
Angular has a steep learning curve; however, it offers great security measures out of the box and is regularly updated and maintained by Google, as it is used by most of their internal systems.
By using Angular as the front end, we gain all security measures that the angular framework offers out of the box by preventing cross-site scripting, cross-site request forgery, cross-site script inclusion, and also sanitizing user input to name a few.
In order to be protected against such attacks, Caseflow follows Angular's guidelines: not generating HTML from user input, not changing the original Angular code, keeping up to date as soon as viable, not interacting with the DOM directly, using Angular's provided HttpClient, and so on. By using Angular's routing module for all application-internal routing, we protect against open redirect attacks.
It is because of all the preceding benefits that Caseflow utilizes Angular for the development of its front end.
Caseflow protects users from social hacking by requiring them to open a link that we send by email every single time they log in to the application, a link that is valid only for a limited time, meaning potential attackers would need access to their email inbox as well. Having security on the front-end alone is far from enough to protect the data, as malicious users can always use other tools to call server-side endpoints directly, skipping the front-end entirely.
Back end
Authentication: The back-end is developed in .NET Core, which enables Caseflow to rapidly develop new features and provides patterns that ensure the authentication and authorization of every request. We follow the middleware pipeline delivered by Microsoft, which validates the tokens and fails the request immediately (401, unauthenticated) without a single line of business logic or data-query code being executed. This filters out any possible attack by malicious users who are not authenticated.
Authorization: The authorization is based on the .NET policy engine, which denies access to all users (even authenticated ones) if they do not have the required roles and access permissions. In essence, Participants can only access their area (scope) of the API and will be denied access to the Event Organizers' REST endpoints.
CORS: We allow traffic only from specific domains by using a CORS policy; we do not allow requests from any domain unknown to us. Since we host multiple competitions on the same domain (challenge1 and challenge2 are both hosted on *.caseflow.io), we need to be careful about attacks such as session hijacking. However, since we use token-based rather than cookie-based authentication, malicious scripts cannot affect the cookie of another session: it does not exist and is not relevant to our application.
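The allow-list behaviour of such a CORS policy can be sketched as follows. The origins are illustrative examples; real origins would come from configuration.

```typescript
// Assumed allow-list of trusted origins.
const allowedOrigins = new Set([
  "https://challenge1.caseflow.io",
  "https://challenge2.caseflow.io",
]);

// Returns the value to echo back in Access-Control-Allow-Origin,
// or null to refuse the cross-origin request entirely.
function corsOriginFor(requestOrigin: string): string | null {
  return allowedOrigins.has(requestOrigin) ? requestOrigin : null;
}
```

Echoing back only known origins (rather than `*`) is what keeps unknown domains from reading API responses in a browser context.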
Database: Access to the database is protected from malicious users by never executing user input directly. All user input has to pass through a validation stage, where invalid requests fail fast. The next validation is on the business layer, where user input is converted to a domain object and business validations take place, again failing if any input is invalid. If all validations pass, the queries are executed by an ORM, which protects us from SQL injection attacks by executing all database access through parameterized queries. Additionally, we employ code linters that do not allow commits of code containing raw SQL queries.
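A minimal illustration of the parameterized-query shape, in TypeScript for consistency with the other sketches (the real back-end uses an ORM on .NET Core). The table and column names are hypothetical.

```typescript
// The SQL text and the user input travel separately: the input is bound
// by the database driver and is never spliced into the query string.
type ParameterizedQuery = { text: string; values: unknown[] };

function findParticipantByEmail(email: string): ParameterizedQuery {
  return {
    text: "SELECT id, email FROM participants WHERE email = $1",
    values: [email], // bound as a parameter, not concatenated
  };
}
```

Even a classic injection payload such as `x' OR '1'='1` arrives at the database as a plain string value, so it can only ever match a literal email.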
Code review practices
Some of the above measures are only effective if the developers follow standards, and there is always a chance that someone does not (human error). To mitigate this, Caseflow has strict code reviews in place, where all team members need to approve new code before it enters the codebase. Policies declared on the repository make it impossible for new code to enter the 'dev' or 'master' branches directly (we use the gitflow branching strategy) without a pull request. For a pull request to be completed, or merged into the dev branch, it has to satisfy the policies. One of the policies is that all team members have reviewed and tested the code and approve of it. Only then can it enter the codebase and be published to production.
Vulnerability management procedures
Caseflow is aware that, sooner or later, vulnerabilities and other threats will be discovered in the system. That is why procedures define what has to be done in such cases. On the infrastructure level, Caseflow uses PaaS, so we can be sure that the servers are up to date at all times; this is the responsibility of the cloud provider.
Detection
Vulnerabilities can be detected by the CI/CD pipeline, in the form of outdated packages with confirmed vulnerabilities; through monitoring of the Caseflow application with Application Insights, where every request, successful or failed, is logged and can be inspected; during testing of the application by the development team; or through reports by application users.
Active penetration testing is done as part of the CI/CD process, using Whitesource.
Monitoring
Caseflow uses Application Insights, a Microsoft service that integrates tightly with the back-end. It provides a wide variety of logging and monitoring features, including request logging, parameters, execution times, request counts, and alerts when a request takes too long, and it even suggests improvements.
Application Insights also alerts us when a new failed request occurs, providing the stack trace so the developers can easily fix the issue; we are notified as soon as a new issue occurs instead of having to check for one periodically.
Prioritization
As soon as a vulnerability is detected, it is prioritized by the project manager and the team leads.
The prioritization is done by assigning a severity level to the issue: a value on a scale of 1-4, where 1 is "critical" and 4 is "low priority/nice to have".
Patching
A severity level of "critical" means that developers stop whatever they are doing and immediately investigate and fix the issue. The fix is then tested locally and, if successful, published to the beta environment, which resembles the production environment, and tested there. If the tests are successful on beta, the patch is released to the production environment and becomes available to general users.
Other severity levels mean that the issue will be fixed during the current sprint, while the lowest severity level means it can be moved to the next sprint.
Regardless of the severity level, all issues are tested and released to production in the same way, described above.
Concluding Remarks
Caseflow prides itself on maintaining top-of-class security of its services. Addressing any and all privacy concerns of our customers is among our top priorities. Caseflow has carefully studied the current applicable data security laws and guidelines.
We appreciate you taking these issues as seriously as we do. If any questions or considerations have gone unanswered in this article, we will gladly assist you further. Contact information can be found on caseflow.io.