Documents
eBook: Navigating Government's Big Data Journey
Each chapter of this e-Book will focus on an essential element for development of a comprehensive data strategy.
Solution Brief: Mobilizing the Military with Data Visibility
This white paper addresses how actionable data can make a difference for national defense efforts.
Infographic: What's Holding Back Defense Logistics?
This piece lays out the biggest data challenges facing government today and outlines key solutions.
Videos
A View from the Hill: Defending Vulnerable Populations in the Digital World
Perspectives on Ethical Big Data Governance
The Essentials of Apache Hadoop
Precision Medicine in the Big Data World
Relying on Data for Strategic Decision-Making - Financial Services Experience
The Role of "A" in SMAC
Podcasts
Big Data in Government, Financial Services and Health Life Sciences
Precision Medicine in the Big Data World
Data Encryption and Security
Blog Posts
Key Messages from the 5th Annual Cloudera Government Forum
By William Sullivan, Vice President, Public Sector, Cloudera
The 5th Annual Cloudera Government Forum agenda focused on the theme Technology Revolution Driving Business Evolution. The speakers included government professionals and their industry partners, who shared their individual and collective insights during the March 15th program.
If you missed all or part of the live program, you can still view the sessions on demand and see if you agree with these key takeaways:
1) The State of Government Big Data Management and Analytics
Clearly big data has moved out of the exclusive environments of the research labs and university data centers and into many agencies. Today’s public sector technologists and program managers are leveraging the data they have to support diverse requirements, better align resources and skill sets to the current mission, and look at data collection and analytics in ways that used to be impractical or impossible.
Attendees welcomed the broad participation of engaging speakers from across the government, including the CIA’s Deputy Director for Digital Innovation Andrew Hallman, Treasury CIO Sonny Bhagowalia, and White House Chief Data Scientist D.J. Patil. They talked not only about today’s use cases but also about the future and how agencies can prepare to take advantage of future decision support tools and training.
2) Hadoop and How It Helps Agencies Address Big Data Challenges
Attendees at the 2016 Cloudera Government Forum ranged from those familiar with applying advanced data analytics to enormous datasets comprising a variety of data types and sources, to those just becoming acquainted with Hadoop and how it can become an integral tool to support agency analytics.
Participants enjoyed an executive overview of the Apache Hadoop ecosystem, which covered basic concepts, a glossary of terms, and real-world applications that support current agency requirements for business and mission support.
3) Data Analytics for a Range of Applications
Forum attendees heard from agency business leaders what they are doing with their data and how they are gaining a data-driven understanding of their most fundamental operational considerations. Case studies discussed the results agencies are seeing by using data analytics for mission-specific outcomes—from defending vulnerable populations to strategic decision-making in the financial services sector.
Several sessions challenged attendees to focus on difficult but essential topics such as Ethical Big Data Governance, where Steve Totman, Cloudera’s financial services industry leader, offered common-sense guidelines. Mr. Totman highlighted the need for agencies to inventory their data before implementing a modern technology foundation, so that they understand the legal frameworks that apply to it, and then to define usage guidelines and privacy policies and publish those guidelines.
4) Precision Medicine — the Future of Medicine?
One especially exciting application for big data analytics is the Precision Medicine Initiative (PMI), announced by the Administration in 2015 to “revolutionize how we improve health and treat disease…(with) a new model of patient-powered research that promises to accelerate biomedical discoveries and provide clinicians with new tools, knowledge, and therapies to select which treatments will work best for which patients.” Learn more about Cloudera’s support of the PMI.
During the Cloudera Government Forum, Cloudera’s industry leader for health and life science, Shawn Dolley, explained this new approach to disease prevention and treatment that takes into account individual genetic variation, and he provided an overview of the National Institutes of Health program that aims to gather sufficient data to put this practice into place clinically.
5) Strategically Applied, Data Can Save Lives
At the end of the day, NYU’s Beth Simone Noveck, also co-founder and director of The Governance Lab, gave a keynote that addressed what she called the most important party in the new data frontier: citizens.
She spoke to just how dramatically an average citizen can benefit from a collaborative and strategic application of data in a community. For illustration, she mentioned PulsePoint, a smartphone application that connects volunteer first responders into civil emergency response systems to provide broader and more timely assistance to high-risk citizens, such as heart attack victims.
“Technologies of data are important, but so are technologies of expertise,” Noveck said. “They give us the opportunity to play our parts, get involved, and do well by doing good.”
6) A Forum Designed for Peer-to-Peer Collaboration
The idea that agencies and their partners should get as involved as they can with data ran through every presentation at Cloudera’s 5th Annual Government Forum. Those who are successful first develop an intimate understanding of the power of data and of their responsibility for the data they collect, create, store and manage. From there, measurable transformation becomes possible. I encourage you to take a few minutes to explore the remarks, videos, and podcasts from the 2016 Cloudera Government Forum and to share your feedback regarding “essential content” for next year’s program.
Looking to government’s future with Hadoop and celebrating its past
An observation in Hadoop’s 10th year: how it has changed and where it will go
By Webster Mudge
Our relationship with data – whether as a private firm or a public agency – has only grown more intimate with time. And, as Cloudera Co-Founder and CSO Mike Olson pointed out recently, Apache Hadoop has spent the last 10 years cultivating that relationship, helping organize and transform data and data usage from healthcare to defense to intelligence to finance.
Mike took a look at some of the ways Hadoop will be changing over the course of the next decade. It is important, though, to provide context for these projections and review the foundations of Hadoop; in short, to answer the question: what is Hadoop?
For agencies that use data, it is increasingly a critical component of their infrastructure. And it often seems as though many know Hadoop is important, but few know why.
That answer begins at Hadoop’s very origins. Hadoop started as a single data processing framework built atop a scalable and economical storage layer. It was designed to bring larger and more varied data sets together and address data processing at Internet-scale, and when it debuted, web companies – those that really did have “big” and “varied” data problems – gained exciting new capabilities that could render deeper insight from this broader set of data, offering more analytical power and breadth than they’d ever encountered prior.
Yet arguably the most critical element to its future adoption and capability – Hadoop’s “change agent,” so to speak – was the manner in which Hadoop-stored data promoted an agility that allowed the interpretation and re-interpretation of a single set of data without recourse. Data at scale could be viewed through as many lenses as needed, or even through the same lens ever so slightly adjusted to highlight or pull forward a previously unimportant or unknown data point, all without changing the underlying source data. This provided the foundation for an extended ecosystem of multiple analytical tools, frameworks, and applications.
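As a rough illustration of this idea, the short Python sketch below keeps a small set of raw records unchanged and reads them through two different "lenses." It is not Hadoop code; the record layout, field names, and helper functions are hypothetical and exist only to show the principle of interpreting stored data at read time.

```python
# Hypothetical raw records, stored once and never modified (illustrative data only).
RAW_EVENTS = (
    "2016-03-15,clinic-a,visit,42",
    "2016-03-15,clinic-b,visit,17",
    "2016-03-16,clinic-a,refill,8",
)


def parse(line):
    """Apply a schema at read time; the stored line itself is left untouched."""
    date, site, kind, count = line.split(",")
    return {"date": date, "site": site, "kind": kind, "count": int(count)}


def totals_by(field, raw=RAW_EVENTS):
    """One 'lens': sum the counts grouped by any field of the parsed record."""
    totals = {}
    for record in map(parse, raw):
        key = record[field]
        totals[key] = totals.get(key, 0) + record["count"]
    return totals


by_site = totals_by("site")  # first interpretation of the data
by_date = totals_by("date")  # second interpretation of the same, unchanged data
```

Because each lens parses the raw lines on demand, a new question only requires a new call such as totals_by("kind"), with no change to the stored records.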
These foundations have since solved more than a few problems for government, and Hadoop has evolved to include a myriad of data processing, analytic, and storage capabilities to help meet the mission.
Today, the proof of Hadoop’s capacity for change can be seen in its ecosystem’s growth and evolution. For example, Apache Spark, a flexible data processing engine, brings easier development and faster processing to Hadoop, and is primed to replace the traditional MapReduce processing framework. Part of Spark’s power comes from its ability to digest and derive intelligence in highly iterative and programmatic forms, like machine learning algorithms. This is an important development in computing, as many workflows and data sets within the agency fit this model of processing, and Spark also extends to address many emerging stream processing needs.
This will prove helpful in efforts like the Precision Medicine Initiative, a White House program through which medical professionals are using Hadoop in its entirety to catch the “tidal wave of data” flooding in from genomics, proteomics and all sorts of other -omics. With tools like Spark and Apache Impala (incubating), the fast, analytical SQL engine, they can begin to understand medical data from histories of diverse patients, ultimately to better diagnose and treat disease.
The same data-gathering process can apply to business transformation. With a focus on efficiency and security, organizations can use advanced analytics to drive timely and cost-effective decision-making to empower their mission and business objectives in areas such as insider threat, fraud detection, and supply chain optimization.
In short — as tools for using it have advanced, so has data’s status as an asset.
So, where is Hadoop headed? Wherever it needs to go to help people, organizations, and agencies in business and mission. Bright people still are drawn to its vibrant development communities to contribute both insight and code, and the open source nature of Hadoop stokes the fires of this continuing growth.
Cloudera has a special perspective on this future state, by way of a partnership with Intel, that considers the changing hardware landscape that’s going to support these advances. For example, the physics of data storage that were common knowledge ten years ago are essentially being rewritten by new capabilities like Intel’s 3D XPoint (pronounced crosspoint), a technology that will allow incredible amounts of information to fit into incredibly dense yet fast spaces.
3D XPoint epitomizes data transformation bridging both storage and processing. Orders of magnitude more capable than memory structures of the past, this technology will create a new standard, within budget, for agencies. Frameworks like Spark or Impala will benefit tremendously from hardware advances like 3D XPoint, and as terabytes of memory become commonplace, so too will these computing frameworks and their associated analytics within the mission workflow.
Innovations like these embody Hadoop. The transformative power of data in the years to come will yield immense breadth and depth of influence within the agency, and Hadoop will play an important role in its propagation and scope.
Storage innovation, networking improvements, and the integration of special-purpose processing into the fabric of computer systems are just a few of the advances that will involve or affect Hadoop and consequently influence data within the agency. The moral of Mike’s post and this year’s forum is clear: Hadoop has made an enormous impact, and it promises to keep doing so by reinventing itself time and again as it matures.
Develop your data strategy at the 2016 Cloudera Government Forum
Leaders meet in Washington, DC, on March 15 to learn how to develop a data strategy that evolves with business needs
By Webster Mudge
Technology is underpinning a far-reaching transformation within our government. At both the federal and state levels, agencies are creating, consuming, and demanding access to richer and more longitudinal data sources and analytics to accomplish their missions, and many government leaders are pursuing data strategies that securely and efficiently turn data into mission-critical assets.
With real-time data analytics, for example, defense agencies can connect soldiers to services that streamline communications and enhance ISR, monitor and execute supply chain logistics, and deliver more timely and personalized medical care and support. Secure, effective use of data is at the core of President Obama’s Precision Medicine Initiative (PMI) to revolutionize healthcare and is the driver behind an Immigration and Customs Enforcement program at the Department of Homeland Security to track and counter the proliferation of weapons into the ranks of our nation’s enemies. Many of these initiatives rely on a modern data management and analytics platform built to address the challenges of big data: Apache Hadoop.
This year’s Cloudera Government Forum coincides with an important milestone for this technology foundation — the tenth anniversary of Apache Hadoop. In the last decade, organizations both public and private have moved to Hadoop as a way to advance their IT strategies and infrastructures to grow their business and mission capabilities.
In its most basic form, Hadoop is a reliable, agile, and secure platform that gives businesses room to explore what was previously impossible or prohibitively difficult. Agencies and organizations starting out on their data journey can also draw upon a myriad of exciting new computing strategies forged from these fundamentals by big data pioneers and early movers.
Our goal with the Government Forum is to bring together leaders who can share real-world examples of how data and a data-driven mindset improve the decision-making process and illustrate the ways in which data has started and will continue the current trend of business transformation. For most agencies, this evolutionary journey has just begun, and Cloudera offers both guidance and insight to help agency leaders navigate these changes.
For those government leaders who are new to the technology and the data-driven culture, there are a number of valuable resources that can help define and articulate the strategic objectives, including an e-book on navigating the early stages of the data journey.
Any evaluation and strategy must include discussions regarding the policies and procedures for secure data management and analysis. With security and effective governance a top concern, agencies should be taking a five-pronged approach to protecting their data.
Developing data awareness and moving to a culture with data at its core takes new conversations, including concentrated efforts to educate stakeholders on advancing and evolving their data strategies. Cloudera’s Center of Excellence is a teaching framework that can help technology leaders scale “best practices” for data management across an organization. The framework shows leaders how to develop, deploy, operate, and publish, and how to repeat the processes so the agency can continue on a path to business evolution.
Whether an agency is new to data analytics for the public sector or continuing to advance its program, it is important to learn by example when it comes to data strategy. Cloudera’s white paper on the enterprise data hub for the public sector is an excellent starting point for those seeking more.
Most people are not technologists, which is why the learning process is such an important step. With Cloudera as a technology partner, agency leaders gain the experience and capability to address key management issues, foster adoption and promote culture change, and evolve data strategies to meet the challenges and objectives of today and tomorrow.
In its fifth year, the 2016 Cloudera Government Forum brings together public and private sector practitioners to discuss the state of data analytics and share case studies of how agencies are leveraging their data to enable improved decision-making. Spend a day discovering how your organization can evolve with transformative technologies powered by Hadoop and results-driven data management and analysis. Register now.