Self-service data prep may sound easy, intuitive, and exactly what you need for
that daunting data transformation and preparation project. Yet, given the vast
differences among products and solutions available today, it is easy to get lost.
This guide offers ten real business scenarios to help you cut through the noise
and focus on what truly fits your needs and requirements. Each scenario includes
real-life examples and use cases so that you can recognize your own environment
in the description.
1 IS REAL TIME EXPLORATION OF DATA AN IMPORTANT PART OF YOUR DATA PREPARATION?
Not all data prep solutions are created equal. Some follow a workflow style where
predetermined rules are applied to prepare data for analysis. This requires knowing
the questions to ask of the data beforehand. For example, in a supply chain scenario,
you should prescribe the tool to replace any supplier location
BUSINESS SCENARIOS
Data lakes that were populated with no attention to data quality
Data with high potential of anomaly, e.g. weight that could be recorded in ounces, pounds, grams, kilograms, and more
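The weight anomaly listed above can be made concrete with a short sketch. It assumes, hypothetically, that each record carries a numeric value and a unit label; the unit labels, the choice of grams as the target unit, and the function name are illustrative assumptions, not part of any product.

```python
# Conversion factors to grams. The unit labels are assumptions for illustration.
GRAMS_PER_UNIT = {
    "g": 1.0,
    "kg": 1000.0,
    "oz": 28.349523125,
    "lb": 453.59237,
}


def to_grams(value, unit):
    """Convert a weight to grams, or raise ValueError for an unknown unit."""
    key = unit.strip().lower()
    if key not in GRAMS_PER_UNIT:
        # Surface the anomaly instead of silently guessing a unit.
        raise ValueError(f"unknown weight unit: {unit!r}")
    return value * GRAMS_PER_UNIT[key]


# Hypothetical records mixing four units for the same kind of measurement.
records = [(2.0, "kg"), (16.0, "oz"), (1.0, "LB"), (500.0, "g")]
normalized = [round(to_grams(value, unit), 2) for value, unit in records]
print(normalized)
```

Raising on an unknown unit is a deliberate choice here: an unexpected label is exactly the kind of anomaly that is easier to catch while exploring the data than to anticipate in a predetermined rule.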
paxata.com 3
2 DO YOU HAVE A LOT OF UNKNOWN DATASETS, SUCH AS THIRD-PARTY DATA AND FORM-FILLS?
When data is sourced from in-house systems of record, the user typically possesses
some level of knowledge about it. However, very little is known about data that
comes from external sources. For example, onboarding new client or supplier data
varies in type and complexity.
The same situation occurs when data comes from form fields such as those in survey
software, marketing automation, logistics and scheduling apps, ERP, and others.
Data in these situations typically have a wide variety of misspellings.
BUSINESS SCENARIOS
Onboarding and blending second- and third-party data with first-party data
Curating external sources of data to create a data product
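As a rough illustration of handling misspelled form entries, the sketch below uses Python's standard `difflib` module to suggest the closest value from a reference list. The reference list, the sample entries, and the 0.8 similarity cutoff are all hypothetical; a real project would tune them against its own data.

```python
from difflib import get_close_matches

# Hypothetical reference list of accepted values for one form field.
KNOWN_STATES = ["California", "Colorado", "Connecticut"]


def suggest_correction(raw_value, known_values=KNOWN_STATES, cutoff=0.8):
    """Return the closest known value, or None when nothing is similar enough."""
    matches = get_close_matches(raw_value.strip(), known_values, n=1, cutoff=cutoff)
    return matches[0] if matches else None


# Hypothetical free-text entries as they might arrive from a form.
form_entries = ["Califronia", "Colorodo", "Conneticut", "Texas"]
suggestions = {entry: suggest_correction(entry) for entry in form_entries}
for entry, suggestion in suggestions.items():
    print(f"{entry!r} -> {suggestion!r}")
```

Entries with no sufficiently similar match come back as `None`, so they can be routed to a person for review rather than forced onto the nearest known value.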
3 WOULD INACCURATE DATA JEOPARDIZE YOUR REPUTATION OR REVENUE?
Some data preparation tools limit the user to a small sample of data. In this case,
one is left to hope that all of the possible anomalies and outliers are included in
the small, allocated sample of data. While this is a
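To put a number on the sampling risk described in this section, the following sketch computes the chance that a sample contains none of the anomalous rows. It assumes the sample is drawn uniformly at random without replacement, and the row counts are hypothetical figures chosen only for illustration.

```python
from math import comb


def miss_probability(total_rows, anomalous_rows, sample_size):
    """Probability that a uniform random sample contains no anomalous row."""
    clean_rows = total_rows - anomalous_rows
    if sample_size > clean_rows:
        return 0.0  # the sample is too large to avoid every anomaly
    # Hypergeometric probability of drawing only clean rows.
    return comb(clean_rows, sample_size) / comb(total_rows, sample_size)


# Hypothetical case: 10 bad rows among 1,000,000, inspected via a 1,000-row sample.
p_miss = miss_probability(1_000_000, 10, 1_000)
print(f"chance the sample shows no anomaly at all: {p_miss:.3f}")
```

Under these assumptions the sample misses every anomaly about 99% of the time, which is why a clean-looking sample says little about the full dataset.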
4 IS GOVERNANCE A KEY PIECE OF YOUR DATA PREPARATION AND REPORTING?
In many cases, a data prep project leads to downstream reporting and analytics that
are used across executive
The key is to provide end-to-end traceability of data. For example, in cases where
the information from the data preparation ends up in downstream systems such as
business intelligence applications or public portals, it is vital to show a full
lineage.
BUSINESS SCENARIOS
Evidence-based medicine in healthcare
Compliance reporting, such as anti-money laundering
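One minimal way to picture lineage is to record every preparation step alongside the data it produced. The sketch below is an assumption-laden illustration, not a description of any product: the `TracedDataset` class, its fields, and the step names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TracedDataset:
    """Rows plus the ordered list of steps that produced them."""
    rows: tuple
    lineage: tuple = ()

    def apply(self, step_name, transform):
        """Run transform on the rows and append a lineage entry for it."""
        new_rows = tuple(transform(self.rows))
        entry = {
            "step": step_name,
            "rows_in": len(self.rows),
            "rows_out": len(new_rows),
        }
        return TracedDataset(new_rows, self.lineage + (entry,))


# Hypothetical supplier names with a blank and a case variant.
source = TracedDataset(rows=("acme", "", "Acme", "zenith"))
prepared = (
    source
    .apply("drop_blank", lambda rows: [r for r in rows if r])
    .apply("lowercase", lambda rows: [r.lower() for r in rows])
    .apply("deduplicate", lambda rows: sorted(set(rows)))
)
for entry in prepared.lineage:
    print(entry)
```

Because each step returns a new object and never mutates the old one, the lineage attached to a result always describes exactly how that result was produced.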
5 IS VERSIONING AND HAVING A SNAPSHOT OF YOUR DATA PREP PROJECTS AND DATASETS CRITICAL?
In some cases, data prep projects are a one-time event. However, you will certainly
accumulate new data, expand information presented in a business dashboard, extend
source systems, or need to record snapshots of your prepared data for point-in-time
analysis.
A basic example is when one set of data is prepared for bookings before it is
augmented with another dataset to conduct bookings versus revenue analysis. In this
scenario, the user may want to store and version “bookings” before blending it with
other data sets and
BUSINESS SCENARIOS
Regulatory and audit-heavy programs
Data quality monitoring
Single source to multi-source progression of data blending
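The bookings example can be sketched as a small snapshot store that versions a dataset by a hash of its content before it is blended with anything else. Everything here is hypothetical: the `SnapshotStore` class, the field names, and the figures are illustrative assumptions only.

```python
import copy
import hashlib
import json


def fingerprint(rows):
    """Return a stable SHA-256 hash of the dataset's content."""
    payload = json.dumps(rows, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


class SnapshotStore:
    """Keeps immutable copies of datasets, keyed by name and content version."""

    def __init__(self):
        self._versions = {}

    def save(self, name, rows):
        version = fingerprint(rows)[:12]
        self._versions[(name, version)] = copy.deepcopy(rows)
        return version

    def load(self, name, version):
        return copy.deepcopy(self._versions[(name, version)])


store = SnapshotStore()
bookings = [{"deal": "A-1", "booked": 1200}, {"deal": "A-2", "booked": 800}]
bookings_version = store.save("bookings", bookings)

# Blend in hypothetical revenue figures; the saved snapshot is unaffected.
revenue = {"A-1": 1000, "A-2": 800}
for row in bookings:
    row["revenue"] = revenue[row["deal"]]
blended_version = store.save("bookings_vs_revenue", bookings)

print(store.load("bookings", bookings_version))
```

Deriving the version from the content means that saving identical data twice yields the same version, while any change to the data produces a new one.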
6 DO YOU HAVE A HIGH RATIO OF BUSINESS SMES TO DATA ENGINEERS?
Historically, data preparation has been within the purview of IT teams. This is
partly a legacy issue, as the tools and techniques created in the 1990s were built
for developers with technical skills. Today, this paradigm is shifting.
The reality is that for every data engineer or technical resource in a given
organization, there are potentially 1,000+ business analysts, all of whom want more
information to perform their jobs better. Therefore, treating data preparation as
an IT task inherently creates a bottleneck.
The context of the data remains within the line of business. For instance, a supply
chain manager would be aware that “Govis Pharmaceuticals” and “Novis Pharmaceuticals”
are in fact the same vendor, and the different entries are purely recording mistakes.
IT would not have knowledge of that context.
BUSINESS SCENARIOS
Weekly sales reporting
Supply chain / inventory management
IoT device usage data analysis
Call center data prep
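The vendor example suggests a division of labor: software can flag look-alike names, but only someone with business context can confirm that they are the same vendor. The sketch below illustrates that split with Python's standard `difflib`; the 0.9 threshold, the function names, the third vendor, and the choice of which spelling is correct are all hypothetical.

```python
from difflib import SequenceMatcher
from itertools import combinations


def propose_merges(names, threshold=0.9):
    """Return pairs of names similar enough to be worth an SME's review."""
    proposals = []
    for left, right in combinations(sorted(set(names)), 2):
        score = SequenceMatcher(None, left.lower(), right.lower()).ratio()
        if score >= threshold:
            proposals.append((left, right, round(score, 3)))
    return proposals


def apply_approved(names, approved):
    """Rewrite only the names that a subject matter expert has signed off on."""
    return [approved.get(name, name) for name in names]


vendors = ["Govis Pharmaceuticals", "Novis Pharmaceuticals", "Acme Logistics"]
proposals = propose_merges(vendors)
print(proposals)

# Hypothetical SME decision: the supply chain manager picks the correct spelling.
approved = {"Govis Pharmaceuticals": "Novis Pharmaceuticals"}
cleaned = apply_approved(vendors, approved)
print(cleaned)
```

The similarity score only nominates candidates. Two genuinely different vendors can have near-identical names, so the merge is applied only after a person approves it.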
7 DO YOU NEED TO CREATE GOVERNED BUSINESS USER SANDBOXES FOR YOUR DATA LAKE?
In addition to scenarios where a business team wants to take a hands-on approach to
its data preparation as described in section 6, there are scenarios where IT desires
to provide a governed or contained environment for its business users.
BUSINESS SCENARIOS
Data lake exploration
8 ARE YOU OPERATING IN A MULTI-REGION, MULTI-DEPARTMENT, OR MULTI-CLIENT ENVIRONMENT?
A basic data prep tool cannot handle this many layers of authentication and
authorization complexity, while an enterprise one can.
BUSINESS SCENARIOS
Consultants and OEMs who provide data prep services to their customers, accessing all of their customer tenants using a single ID and password
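The single-identity, many-tenants scenario can be pictured as one user record holding a separate role for each tenant. This is only a sketch under stated assumptions: the role names, the actions, the tenant names, and the `is_allowed` helper are hypothetical and do not describe any particular product's security model.

```python
from dataclasses import dataclass, field

# Hypothetical role names and the actions each role permits.
ROLE_ACTIONS = {
    "viewer": {"read"},
    "editor": {"read", "prepare"},
    "admin": {"read", "prepare", "publish"},
}


@dataclass
class User:
    """One login identity with a separate role per tenant."""
    user_id: str
    tenant_roles: dict = field(default_factory=dict)  # tenant name -> role name


def is_allowed(user, tenant, action):
    """Allow the action only if the user's role in that tenant permits it."""
    role = user.tenant_roles.get(tenant)
    return role is not None and action in ROLE_ACTIONS.get(role, set())


# A hypothetical consultant who serves two client tenants with one identity.
consultant = User("consultant-01", {"client_a": "admin", "client_b": "viewer"})
print(is_allowed(consultant, "client_a", "publish"))
print(is_allowed(consultant, "client_b", "prepare"))
print(is_allowed(consultant, "client_c", "read"))
```

Keeping the role lookup per tenant means the same login can be an administrator for one client and read-only for another, with no access at all to tenants where no role is assigned.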
9 DO YOU HAVE A MULTI-CLOUD STRATEGY?
Many companies today are avoiding the vendor lock-in situations that occurred a
couple of decades ago with the titans of the enterprise software industry. These
companies are considering a hybrid environment, using a mixture of various cloud
environments and in-house systems.
10 DO YOU HAVE A VARIETY OF STAKEHOLDERS WHO WANT TO PARTICIPATE IN DATA PREPARATION PROJECTS?
CLOSING THOUGHTS
Choosing the right data prep solution is not easy. While on the surface everything
sounds the same, you now know that not all tools are created equal. You need to
bear in mind your business scenarios, types of users, and internal and external
requirements before selecting the right data prep solution that is tailored to your
organization’s specific needs.
Companies around the globe rely on Paxata to get smart about information. Paxata is
the pioneer that intelligently empowers all business consumers to transform raw data
into ready information, instantly and automatically, with an enterprise-grade, self-service
data preparation application and machine learning platform. Our Adaptive Information
Platform weaves data into an information fabric from any source and any cloud to create
trusted insights. Business consumers use clicks, not code to achieve results in minutes,
not months. With Paxata, Be an Information Inspired Business.
Paxata Headquarters 1800 Seaport Boulevard Redwood City, CA 94063 1-855-9-PAXATA paxata.com