This joint session includes two talks from RV College of Engineering.
Implementing Conditional Cleanup after Regression Testing in HPCC Systems - Goutami Sooda, Arya Vinod, Ahana Patil, & Chandana S, RVCE
HPCC Systems employs a regression testing module that validates new code additions and ensures their seamless integration into the existing codebase. The regression test engine executes a large suite of test cases, generating a substantial number of workunits on the HPCC Systems cluster. In cloud-based environments, the accumulation of these workunits consumes significant resources, which both increases operational costs and raises resource-utilization concerns.
To address this challenge, our project introduced an automated mechanism that deletes workunits after each run of the regression testing. A cleanup parameter lets users choose the deletion mode when launching the regression test: delete all workunits, delete only those created by passing test cases, or delete none. Additionally, the cleanup module generates custom log files to facilitate debugging and troubleshooting.
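The three deletion modes above can be pictured as a small selection routine. This is only an illustrative sketch: the mode names (`all`, `passed`, `none`) and the workunit fields used here are assumptions for the example, not the engine's actual parameters or data structures.

```python
def select_workunits_to_delete(workunits, mode):
    """Pick which workunits a cleanup pass should delete.

    workunits: list of dicts with 'wuid' and 'passed' keys
               (hypothetical layout for this sketch).
    mode: 'all' deletes every workunit, 'passed' deletes only
          workunits from passing test cases, 'none' disables cleanup.
    """
    if mode == "none":
        return []
    if mode == "all":
        return list(workunits)
    if mode == "passed":
        return [wu for wu in workunits if wu["passed"]]
    raise ValueError(f"unknown cleanup mode: {mode}")
```

In the real module this selection would drive workunit deletion requests against the cluster; the sketch only shows the mode logic.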
The cleanup module designed and developed during this project lets users choose a cleanup mode and, when enabled, can delete the thousands of workunits generated by regression tests across the Thor, Roxie, and hThor clusters. This feature has yielded tangible benefits: reduced operational costs, improved resource utilization, and greater overall efficiency of the regression test engine. By preventing overload of both the cluster and the Dali component, it improves cost-effectiveness and streamlines resource management within the HPCC Systems environment. Overall, the project contributes directly to the extensive codebase of the regression test engine.
Target Audience
• Developers and users of the HPCC Systems regression test engine.
• Tech enthusiasts who wish to stay updated with the latest developments in HPCC Systems.
Data 360° View using HPCC Systems - S Dhanush, Shreyas Shankar, RVCE
In today's data-driven landscape, organizations face significant challenges in managing and leveraging large volumes of data across diverse platforms. This session presents a cohesive solution developed by the RVCE team for ProfitOps Inc., a startup based in Cumming, GA, USA. Built on HPCC Systems, the solution achieves comprehensive data integration and transformation, ensuring seamless connectivity across MySQL, MongoDB, AWS, FTP, and other platforms.
HPCC Systems serves as the core technology for this solution, facilitating streamlined data ingestion, synchronization, transformation, and versioning processes. Whether ingesting data from the HPCC Systems landing zone to MySQL or vice versa, the solution supports bidirectional data flows with robust mechanisms for incremental updates, deletions, and version control across various data formats.
The MongoDB ECL plugin enhances the solution by not only enabling CRUD operations but also ensuring efficient data push and comprehensive versioning through advanced change-tracking techniques. AWS integration automates real-time data synchronization between S3 buckets and the HPCC Systems cluster, thus enhancing file management efficiency and maintaining data integrity with robust versioning strategies.
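The change-tracking idea behind the versioning described above can be sketched as a snapshot diff: fingerprint each record's content and classify records as inserted, updated, or deleted between two snapshots. The record layout and the hash-based comparison are assumptions for this sketch, not the plugin's actual mechanism.

```python
import hashlib
import json


def record_fingerprint(record):
    """Stable content hash of a record, used to detect changes."""
    canonical = json.dumps(record, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def diff_snapshots(old, new):
    """Classify records between two snapshots.

    old, new: dicts mapping record id -> record body.
    Returns ids that were inserted, updated, or deleted.
    """
    inserted = [k for k in new if k not in old]
    deleted = [k for k in old if k not in new]
    updated = [
        k for k in new
        if k in old and record_fingerprint(new[k]) != record_fingerprint(old[k])
    ]
    return {"inserted": inserted, "updated": updated, "deleted": deleted}
```

A versioning layer could then store only the inserted and updated records per run, keeping a compact change history.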
Additionally, a secure FTP server setup ensures reliable data transfer to the HPCC Systems landing zone, triggering uploads through real-time file-detection mechanisms. Within the data transformation and 360° view module, ProfitOps Inc. has incorporated generalized functions for currency conversion, data normalization, and string-data homogenization.
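One simple way to realize the real-time detection mentioned above is to poll a watched directory for files modified since the last scan. This is a minimal stdlib sketch under that assumption; the actual setup may use a different detection mechanism.

```python
import os


def new_files_since(directory, last_seen):
    """Return names of regular files modified after the last_seen timestamp.

    directory: path being watched for incoming files.
    last_seen: Unix timestamp of the previous scan.
    """
    fresh = []
    for name in os.listdir(directory):
        path = os.path.join(directory, name)
        if os.path.isfile(path) and os.path.getmtime(path) > last_seen:
            fresh.append(name)
    return fresh
```

A scheduler would call this periodically and queue any returned files for upload to the landing zone.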
These capabilities are integrated into a modular schema management framework, ensuring consistent and accurate data representation across diverse data sources. Inspired by OpenRefine, advanced transformation functionalities handle complex data cleaning and transformation tasks, thereby improving data quality and uniformity across integrated platforms.
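As a rough illustration of the generalized transformation functions described above, here are two toy examples. The function names, the rounding rule, and the normalization choices are assumptions made for this sketch, not the team's actual implementations.

```python
def convert_currency(amount, rate):
    """Convert an amount using a given exchange rate, rounded to 2 places."""
    return round(amount * rate, 2)


def homogenize_string(value):
    """Normalize a free-text field: trim, collapse whitespace, uppercase."""
    return " ".join(value.split()).upper()
```

Generalized helpers like these let the same cleaning logic be applied uniformly to records arriving from MySQL, MongoDB, S3, or FTP sources.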
This session offers practical insights into optimizing data workflows with HPCC Systems, enabling efficient, reliable, and scalable data operations across multiple platforms. Such an integrated approach empowers organizations to leverage their data assets effectively, driving informed decision-making and operational excellence in today's competitive landscape.
Target Audience
• Professionals responsible for designing and implementing data workflows and integrations.
• Individuals requiring efficient and accurate data retrieval and transformation capabilities for analysis.
• Companies managing large datasets across various platforms, requiring efficient data integration and transformation, such as financial institutions, healthcare providers, and e-commerce platforms.
© 2024 LexisNexis Risk Solutions