Life sciences R&D, pharma and biotech are full of challenges, great ambitions and outcomes that matter to patients. Data quality and data value are among these challenges. Often forgotten or left aside, effective data stewardship is crucial for any business.

Here are some pointers and tips to help you care for (or shepherd, as I call it) your data.
Applying the FAIR and ALCOA+ principles, and understanding what they stand for, will help you make informed decisions from your data, advance your science and meet your business goals.
F is for Findable
The data must use agreed identifiers, terms and dictionaries in its rich metadata, and include clear lineage between the pieces of data. Personal jargon, personal identifiers and missing references to related data greatly diminish the ability to find, use and reuse the data.
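As a rough illustration of what "agreed terms" and "clear lineage" can mean in practice, the sketch below checks one metadata record against a small controlled vocabulary and confirms that its lineage points to a known record. The field names, the vocabulary and the identifiers are all hypothetical, invented for this example; a real setup would rely on your organisation's own dictionaries.

```python
# Hypothetical controlled vocabulary: the agreed terms for each metadata field.
AGREED_TERMS = {
    "assay_type": {"HPLC", "LC-MS", "ELISA"},
    "unit": {"mg/mL", "uM", "nM"},
}


def findability_issues(record, known_ids):
    """Return a list of human-readable problems found in one metadata record."""
    issues = []
    for field, allowed in AGREED_TERMS.items():
        value = record.get(field)
        if value not in allowed:
            issues.append(f"{field}: '{value}' is not an agreed term")
    # Lineage: the parent identifier must refer to a record we can actually find.
    parent = record.get("derived_from")
    if parent is not None and parent not in known_ids:
        issues.append(f"derived_from: '{parent}' is not a known identifier")
    return issues


# Made-up identifiers and records, for illustration only.
known_ids = {"SAMPLE-0001", "SAMPLE-0002"}
good = {"id": "RESULT-0001", "assay_type": "HPLC", "unit": "mg/mL",
        "derived_from": "SAMPLE-0001"}
bad = {"id": "RESULT-0002", "assay_type": "my quick test", "unit": "mg/mL",
       "derived_from": "tube 7"}

print(findability_issues(good, known_ids))
print(findability_issues(bad, known_ids))
```

The first record passes cleanly, while the second is flagged twice: once for personal jargon in the assay type and once for a lineage reference nobody else could resolve.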
A is for Accessible
Data silos are a common problem in the scientific world and beyond. Breaking them down and opening them up to connect to other data systems enables a better flow of data (see my previous article for further thoughts). You also need to consider permissions: Open Science is a great concept that I encourage, but in reality you may need access restrictions for business reasons and/or IP protection. Consider very carefully what restrictions you are adding to your data and the impact they will have. Review them regularly!
I is for Interoperable
In a scientific lab, you use a lot of instruments and systems. Their output is often in a proprietary format that makes the data difficult for both humans and machines to read. Switching to output formats based on industry standards, such as those created and promoted by the Allotrope Foundation, can really improve your data process.
R is for Reusable
Quality data that includes metadata, context, methodology and traceability can be reused for further applications, which adds more value.
By supplementing the FAIR principles above with ALCOA+, you can further increase the quality and value of your data.
A is for Attributable
Tracing the information back to its source, and to the people and physical assets involved in its creation (instruments, samples), is an important criterion for determining whether the data is trustworthy and fit for your purpose.
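One way to picture attribution is a record that cannot be created without saying who produced it, with which instrument and from which sample. This is only a minimal sketch: the `Measurement` class, its field names and the example values are assumptions made for illustration, not a standard schema.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Measurement:
    """Hypothetical record that carries its own attribution."""
    value: float
    unit: str
    recorded_by: str    # person who produced the data
    instrument_id: str  # instrument that produced it
    sample_id: str      # sample it was measured on
    recorded_at: datetime


def is_attributable(m):
    """True when every 'who, with what, on what' field is filled in."""
    return all(s.strip() for s in (m.recorded_by, m.instrument_id, m.sample_id))


when = datetime(2025, 1, 15, 9, 30, tzinfo=timezone.utc)
traced = Measurement(4.2, "mg/mL", "j.doe", "HPLC-03", "SAMPLE-0001", when)
anonymous = Measurement(4.2, "mg/mL", "", "HPLC-03", "SAMPLE-0001", when)

print(is_attributable(traced))     # True
print(is_attributable(anonymous))  # False
```

Making the record immutable (`frozen=True`) is a design choice in this sketch: once written, the attribution cannot be quietly changed.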
L is for Legible
This is very similar to Interoperable, described above. You also want to consider the language you are using: personal jargon and abbreviations are difficult for someone else to read and understand, so avoid them.
C is for Contemporaneous
When was the data produced? Have new data points been produced for this sample since then? Are you using the latest version of a document?
O is for Original
Are you using the original version of a document or a copy? Multiple versions of the same document can be a source of errors… and big headaches! Document management tools can help. Are you copying work that has already been published?
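A simple way to tell whether a copy still matches the original is to compare content fingerprints (checksums). The sketch below uses SHA-256 from Python's standard library; the document contents are made up for the example, and a document management tool would normally handle this for you.

```python
import hashlib


def fingerprint(content: bytes) -> str:
    """Return the SHA-256 hex digest of the given content."""
    return hashlib.sha256(content).hexdigest()


# Made-up document contents, for illustration only.
original = b"Batch 12 assay result: 4.2 mg/mL\n"
faithful_copy = b"Batch 12 assay result: 4.2 mg/mL\n"
edited_copy = b"Batch 12 assay result: 4.3 mg/mL\n"

# Fingerprint stored when the original was registered.
registered = fingerprint(original)

print(fingerprint(faithful_copy) == registered)  # True
print(fingerprint(edited_copy) == registered)    # False
```

A matching fingerprint only shows that the content is identical; it does not tell you which file is the original, so the registered fingerprint itself needs to be stored somewhere trustworthy.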
A is for Accurate
There is no need to explain this one: if the data is inaccurate, how can you make a sound decision? Garbage in, garbage out!
+ is for Complete, Consistent, Enduring and Available
Missing pieces of data, inconsistencies and data "disappearing" will impair the quality of your data and therefore of your decisions. Availability is similar to the Accessibility already discussed above.
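Completeness and consistency lend themselves to simple automated checks. The sketch below looks for missing required fields in a record and for samples reported in more than one unit. The required fields and the example records are hypothetical; which fields count as required depends on your own data model.

```python
# Hypothetical list of fields every record is expected to carry.
REQUIRED_FIELDS = ("sample_id", "value", "unit", "recorded_at")


def completeness_gaps(record):
    """Return the required fields that are missing or empty in one record."""
    return [f for f in REQUIRED_FIELDS if record.get(f) in (None, "")]


def inconsistent_units(records):
    """Return the sample ids whose records use more than one unit."""
    units_by_sample = {}
    for r in records:
        units_by_sample.setdefault(r["sample_id"], set()).add(r["unit"])
    return sorted(s for s, units in units_by_sample.items() if len(units) > 1)


# Made-up records, for illustration only.
records = [
    {"sample_id": "S-1", "value": 4.2, "unit": "mg/mL", "recorded_at": "2025-01-15"},
    {"sample_id": "S-1", "value": 4200.0, "unit": "ug/mL", "recorded_at": "2025-01-16"},
    {"sample_id": "S-2", "value": 1.1, "unit": "mg/mL", "recorded_at": ""},
]

print(completeness_gaps(records[2]))  # ['recorded_at']
print(inconsistent_units(records))    # ['S-1']
```

Checks like these do not fix the data, but they show where a data steward should look first.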
Final thoughts
This list can feel overwhelming, but it is really common sense. Experienced data stewards can help you navigate your data landscape so that you can make better decisions for your research, project and business. Remember to define your goal first and make sure that you are working towards it, not against it!
Contact us to discuss how the Spotty Alicorn can help your digital data journey.
All rights reserved © SALDS 2025