Herd - a managed data lake for the cloud
The Herd unified data catalog helps separate storage from compute in the cloud. Manage petabytes of data and make it accessible for data processing and analytical purposes by any cloud compute platform.
A centralized, auditable catalog for operational usage and data governance
Capture data ancestry for regulatory, forensic, and analytical purposes
Create and launch clusters; load data into clusters from catalog entries
Orchestrate clusters and catalog services to automate processing jobs
Just released! Herd-UI, a search and discovery tool for business and technical users.
Contribute
We encourage everyone who has an idea to fork the code, experiment and share their experiences with us through our GitHub Issues.
If you believe that you have a worthwhile contribution, please open an issue on GitHub and explain your idea.
The herd team will review your idea and either prioritize it or start a discussion about it.
If the issue is agreed upon, start coding.
Remember to write unit tests to maintain our code coverage.
Make sure your code does not contain any passwords or encryption keys from your environment!
Once you have written your code, please make sure to sign off your work when you commit it.
git commit -s -m 'YOUR COMMIT DESCRIPTION'
or, equivalently:
git commit --signoff -m 'YOUR COMMIT DESCRIPTION'
When you sign off, you are agreeing to the following:
Developer's Certificate of Origin (adapted from the Linux kernel)
By making a contribution to this project, I certify that:
(a). The contribution was created in whole or in part by me and I have the right to submit it under the open source license indicated in the file; or
(b). The contribution is based upon previous work that, to the best of my knowledge, is covered under an appropriate open source license and I have the right under that license to submit that work with modifications, whether created in whole or in part by me, under the same open source license (unless I am permitted to submit under a different license), as indicated in the file; or
(c). The contribution was provided directly to me by some other person who certified (a), (b) or (c) and I have not modified it.
(d). I understand and agree that this project and the contribution are public and that a record of the contribution (including all personal information I submit with it, including my sign-off) is maintained indefinitely and may be redistributed consistent with this project or the open source license(s) involved.
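Signing off with `-s` appends a `Signed-off-by:` trailer to the commit message, using the name and email from your git configuration. As a minimal sketch of how you might check a commit message for that trailer before opening a pull request, consider the following. The `has_signoff` helper and the sample name and address are hypothetical and are not part of herd.

```python
import re

# Matches a trailer line such as "Signed-off-by: Jane Doe <jane@example.com>".
# This pattern is an illustrative assumption, not a rule enforced by herd.
SIGNOFF_RE = re.compile(
    r"^Signed-off-by: .+ <[^<>\s]+@[^<>\s]+>$",
    re.MULTILINE,
)


def has_signoff(commit_message: str) -> bool:
    """Return True if the commit message contains a Signed-off-by trailer."""
    return SIGNOFF_RE.search(commit_message) is not None


if __name__ == "__main__":
    signed = "Fix catalog lookup\n\nSigned-off-by: Jane Doe <jane@example.com>\n"
    unsigned = "Fix catalog lookup\n"
    print(has_signoff(signed))
    print(has_signoff(unsigned))
```

To check your latest commit, you could pass the output of `git log -1 --format=%B` to this function.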
Once you have completed the commit, you can make a pull request referencing the initial issue created for the work.
The herd team will process your pull request in a timely fashion and may perform additional testing. You will receive either feedback or a notification that your code will be accepted, along with an expected timeframe for acceptance.
All activity regarding the issue, including contributor and herd team discussions, will occur within the GitHub issues system, so check back frequently and watch issues that interest you.
When you encounter a bug, first check whether it has already been reported by searching GitHub Issues. If the issue is new, create a new ticket. Please provide a clear description of the issue along with a testable scenario.
If you want to get involved in fixing bugs, first read the steps outlined above under "Contribute". Then you can assign yourself to the bug as the owner and start developing your code. Once you have built and tested your code, please make a pull request for review. Once the review is complete, your fix will be merged into the master branch of the herd repository.
We are actively seeking organizations and individuals that are interested in adopting herd and contributing to the development effort. If you want to get involved, you can start by getting to know herd on the wiki or the GitHub project. You can also reach out to herd@finra.org.
If you have any questions or discussion topics, please post them on GitHub Issues.
Team
Evan 'Levi' Allen
Kapil Agarwal
Mona Annaparthi
David Balash
Michael Chao
Aniruddha Das
Sundari Diwakarla
Shane Ebersole
Arthur Felde Man of Mystery
Thomas Frank
Pragnya Gandhi
Tim Griesbach
Patricia Hu
Mahesh Kambli
Taras Katkov
Karishma Patel
Andrew Pach
Max Seo
Kumar Siddhartha
Val Sorokine
Keni Steward
Bala Sundaramoorthy
Sai Suryanarayanan
Wayne Wang
Nate Weisz
Jen Wenner
Greg Wolff
Jim Zhang
Sponsor
The FINRA developer community is actively supporting the herd project. FINRA has graciously allocated time for its internal development resources to enhance herd and encourages participation in the open source community.
http://www.finra.org | http://technology.finra.org
Want to join FINRA? Visit finra.org/careers