Unlocking biomedical data for AI health research in Africa using GeneNetwork

dc.contributor.author: Kilyungi, B. M.
dc.date.accessioned: 2026-04-28T17:51:30Z
dc.date.issued: 2025
dc.description: Full-text thesis
dc.description.abstract: Genetic data analysis is essential for understanding biological processes and diseases. GeneNetwork (GN), an open-source platform with over 20 years of genetic and phenotypic data, relies on a complex relational database. However, the data is currently difficult to access and manipulate because of its complex underlying structures, including around 80 cross-referenced Structured Query Language (SQL) tables and various file types. This dissertation aimed to address the limitations of the GeneNetwork2 SQL database in representing and querying graph-like biological data by transforming it into the Resource Description Framework (RDF). A self-documenting domain-specific language (DSL) was developed using GNU Guile to automate the conversion of GN’s MariaDB SQL database into RDF triples. This involved defining ontologies, mapping SQL views to RDF, and storing the data in Virtuoso. The framework’s effectiveness was evaluated by comparing query performance and output quality between SQL and SPARQL. Results showed that RDF transformation significantly improved query efficiency and semantic richness. At a 99.9% confidence level, SPARQL queries executed statistically significantly faster than the equivalent SQL queries. Additionally, RDF’s structured representation enabled intuitive querying and better relationship discovery, as demonstrated by retrieving mouse species details and searching GeneRIF entries. In conclusion, transforming GN’s data into RDF made complex queries faster and enhanced its FAIR (Findable, Accessible, Interoperable, Reusable) properties, improving accessibility through semantic enrichment and interoperability with federated services for both human and machine agents. This transformation unlocks the full potential of the data, laying the groundwork for a more adaptable, AI-ready GN service and providing valuable insights for the broader application of RDF in biological and clinical data integration.
KEYWORDS: Artificial Intelligence, Data Accessibility, Data Interpretation, GeneNetwork, Biological Data, Data Discovery, Resource Description Framework (RDF), Metadata
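Illustrative note: the abstract describes mapping relational rows to RDF triples. The sketch below shows that general idea only; it is not the thesis's Guile DSL. It assumes a hypothetical `species` table, a placeholder namespace `http://example.org/gn/`, a hypothetical `fullName` predicate, and an in-memory SQLite database standing in for MariaDB. Output is serialized as N-Triples lines.

```python
import sqlite3

# Hypothetical namespace and predicate names; the actual GN ontology is not shown in this record.
BASE = "http://example.org/gn/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"


def escape_literal(value):
    """Escape backslashes, quotes, and line breaks for an N-Triples literal."""
    return (value.replace("\\", "\\\\")
                 .replace('"', '\\"')
                 .replace("\n", "\\n")
                 .replace("\r", "\\r"))


def row_to_triples(row):
    """Map one (id, name, full_name) row to a list of N-Triples lines."""
    species_id, name, full_name = row
    subject = f"<{BASE}species/{species_id}>"
    return [
        f"{subject} <{RDF_TYPE}> <{BASE}Species> .",
        f'{subject} <{RDFS_LABEL}> "{escape_literal(name)}" .',
        f'{subject} <{BASE}fullName> "{escape_literal(full_name)}" .',
    ]


def export_species(conn):
    """Read every row of the (hypothetical) species table and emit its triples."""
    rows = conn.execute("SELECT id, name, full_name FROM species ORDER BY id")
    triples = []
    for row in rows:
        triples.extend(row_to_triples(row))
    return triples


def demo():
    """Build a two-row stand-in table in memory and convert it to triples."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE species (id INTEGER PRIMARY KEY, name TEXT, full_name TEXT)"
    )
    conn.executemany(
        "INSERT INTO species VALUES (?, ?, ?)",
        [(1, "mouse", "Mus musculus"), (2, "rat", "Rattus norvegicus")],
    )
    triples = export_species(conn)
    conn.close()
    return triples


if __name__ == "__main__":
    for line in demo():
        print(line)
```

Each row becomes one subject with one triple per column, which is what lets a graph store answer relationship queries without joining across tables. A production pipeline would load such triples into a triple store (Virtuoso, per the abstract) and query them with SPARQL.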
dc.identifier.uri: https://hdl.handle.net/11071/16490
dc.language.iso: en
dc.publisher: Strathmore University
dc.title: Unlocking biomedical data for AI health research in Africa using GeneNetwork
dc.type: Thesis

Files

Original bundle

Name: Unlocking biomedical data for AI health research in Africa using GeneNetwork.pdf
Size: 655.59 KB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 1.71 KB
Format: Item-specific license agreed upon to submission