    •   SU+ Home
    • Research and Publications
    • Faculty of Information Technology (FIT)
    • FIT Projects, Theses and Dissertations
    • MSIT Theses and Dissertations
    • MSIT Theses and Dissertations (2017)
    • View Item

    Optimized terasort algorithm for data analytics: case of climate data analysis

    Date
    2017
    Author
    Matu, Fiona Mugure
    Abstract
    Weather forecasting has proven valuable in unravelling the causes of natural phenomena and in predicting future climatic conditions. The information it produces in turn supports better preparation and policy making for such occurrences. Climatology is characterised by the analysis of vast amounts of data and therefore requires compute-intensive techniques such as numerical weather prediction (NWP). This made climate modelling a preserve of high performance computing (HPC) until the recent arrival of big data analytics. It is therefore necessary to optimize the algorithms used in big data environments so that they deliver performance comparable to that of HPC environments. The study aimed to improve the big data MapReduce analysis framework by optimizing the TeraSort benchmark algorithm. The proposed algorithm employed classical sort techniques and incorporated quantum computing mechanisms. Historical weather data collected at weather stations across the world was gathered and converted into an organised, human-readable format to serve as input to the program. The proposed algorithm, consisting of map, sort and reduction phases, transformed the bulky observational data into a compact summary of monthly temperature averages in linear complexity. This is a significant performance improvement over the TeraSort algorithm on a single node. The study concludes by suggesting areas that may be explored for further optimization, with emphasis on quantum computing capabilities.
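
The map, sort and reduction phases described in the abstract can be illustrated with a small single-node sketch. This is not the thesis's algorithm: the record layout `(station_id, "YYYY-MM-DD", temperature)`, the function names and the integer month key are all assumptions made for illustration, and the quantum computing mechanisms are not modelled. A classical bucket sort stands in for the sort phase, since it orders bounded integer keys in O(n + k) time, which is one way a pipeline like this could stay linear.

```python
from typing import Iterable, Iterator

# Hypothetical record layout (an assumption, not taken from the thesis):
# (station_id, "YYYY-MM-DD", temperature in degrees Celsius)
Record = tuple[str, str, float]


def map_phase(records: Iterable[Record]) -> Iterator[tuple[int, float]]:
    """Emit (month_key, temperature) with month_key = year * 12 + (month - 1)."""
    for _station, date, temp in records:
        year, month = int(date[0:4]), int(date[5:7])
        yield year * 12 + (month - 1), temp


def sort_phase(pairs: list[tuple[int, float]]) -> list[tuple[int, list[float]]]:
    """Group temperatures by month key in ascending order with a bucket sort.

    Runs in O(n + k), where k is the number of months between the earliest
    and latest record.
    """
    if not pairs:
        return []
    lo = min(key for key, _ in pairs)
    hi = max(key for key, _ in pairs)
    buckets: list[list[float]] = [[] for _ in range(hi - lo + 1)]
    for key, temp in pairs:
        buckets[key - lo].append(temp)
    # Skip months that have no observations.
    return [(lo + i, temps) for i, temps in enumerate(buckets) if temps]


def reduce_phase(grouped: list[tuple[int, list[float]]]) -> dict[str, float]:
    """Collapse each month's temperatures into a single average."""
    averages: dict[str, float] = {}
    for key, temps in grouped:
        year, month_index = divmod(key, 12)
        averages[f"{year:04d}-{month_index + 1:02d}"] = sum(temps) / len(temps)
    return averages


def monthly_averages(records: Iterable[Record]) -> dict[str, float]:
    """Run the three phases in sequence on one node."""
    return reduce_phase(sort_phase(list(map_phase(records))))
```

Calling `monthly_averages` on a list of such records returns a dictionary keyed by `"YYYY-MM"` in chronological order. In a real MapReduce deployment the phases would be distributed across nodes, which this sketch does not attempt.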
    URI
    http://hdl.handle.net/11071/5634
    Collections
    • MSIT Theses and Dissertations (2017) [34]

    DSpace software copyright © 2002-2016  DuraSpace
    Contact Us | Send Feedback
    Theme by 
    Atmire NV
     

     
