Data Analyst

Examines data from multiple disparate sources with the goal of providing security and privacy insight. Designs and implements custom algorithms, workflow processes, and layouts for complex, enterprise-scale data sets used for modeling, data mining, and research purposes.

Below are the Knowledge, Skills, Abilities and Tasks identified as being required to perform this work role.

K0001 Knowledge of computer networking concepts and protocols, and network security methodologies.
K0002 Knowledge of risk management processes (e.g., methods for assessing and mitigating risk).
K0003 Knowledge of laws, regulations, policies, and ethics as they relate to cybersecurity and privacy.
K0004 Knowledge of cybersecurity and privacy principles.
K0005 Knowledge of cyber threats and vulnerabilities.
K0006 Knowledge of specific operational impacts of cybersecurity lapses.
K0015 Knowledge of computer algorithms.
K0016 Knowledge of computer programming principles.
K0020 Knowledge of data administration and data standardization policies.
K0022 Knowledge of data mining and data warehousing principles.
K0023 Knowledge of database management systems, query languages, table relationships, and views.
K0025 Knowledge of digital rights management.
K0031 Knowledge of enterprise messaging systems and associated software.
K0051 Knowledge of low-level computer languages (e.g., assembly languages).
K0052 Knowledge of mathematics (e.g., logarithms, trigonometry, linear algebra, calculus, statistics, and operational analysis).
K0056 Knowledge of network access, identity, and access management (e.g., public key infrastructure, OAuth, OpenID, SAML, SPML).
K0060 Knowledge of operating systems.
K0065 Knowledge of policy-based and risk adaptive access controls.
K0068 Knowledge of programming language structures and logic.
K0069 Knowledge of query languages such as SQL (structured query language).
K0083 Knowledge of sources, characteristics, and uses of the organization's data assets.
K0095 Knowledge of the capabilities and functionality associated with various technologies for organizing and managing information (e.g., databases, bookmarking engines).
K0129 Knowledge of command-line tools (e.g., mkdir, mv, ls, passwd, grep).
K0139 Knowledge of interpreted and compiled computer languages.
K0140 Knowledge of secure coding techniques.
K0193 Knowledge of advanced data remediation security features in databases.
K0197 Knowledge of database access application programming interfaces (e.g., Java Database Connectivity [JDBC]).
K0229 Knowledge of applications that can log errors, exceptions, and application faults.
K0236 Knowledge of how to utilize Hadoop, Java, Python, SQL, Hive, and Pig to explore data.
K0238 Knowledge of machine learning theory and principles.
K0325 Knowledge of Information Theory (e.g., source coding, channel coding, algorithm complexity theory, and data compression).
K0420 Knowledge of database theory.
S0013 Skill in conducting queries and developing algorithms to analyze data structures.
S0017 Skill in creating and utilizing mathematical or statistical models.
S0028 Skill in developing data dictionaries.
S0029 Skill in developing data models.
S0037 Skill in generating queries and reports.
S0060 Skill in writing code in a currently supported programming language (e.g., Java, C++).
S0088 Skill in using binary analysis tools (e.g., Hexedit, command code xxd, hexdump).
S0089 Skill in one-way hash functions (e.g., Secure Hash Algorithm [SHA], Message Digest Algorithm [MD5]).
S0094 Skill in reading Hexadecimal data.
S0095 Skill in identifying common encoding techniques (e.g., Exclusive Disjunction [XOR], American Standard Code for Information Interchange [ASCII], Unicode, Base64, Uuencode, Uniform Resource Locator [URL] encode).
S0103 Skill in assessing the predictive power and subsequent generalizability of a model.
S0106 Skill in data pre-processing (e.g., imputation, dimensionality reduction, normalization, transformation, extraction, filtering, smoothing).
S0109 Skill in identifying hidden patterns or relationships.
S0113 Skill in performing format conversions to create a standard representation of the data.
S0114 Skill in performing sensitivity analysis.
S0118 Skill in developing machine understandable semantic ontologies.
S0119 Skill in Regression Analysis (e.g., Hierarchical Stepwise, Generalized Linear Model, Ordinary Least Squares, Tree-Based Methods, Logistic).
S0123 Skill in transformation analytics (e.g., aggregation, enrichment, processing).
S0125 Skill in using basic descriptive statistics and techniques (e.g., normality, model distribution, scatter plots).
S0126 Skill in using data analysis tools (e.g., Excel, STATA, SAS, SPSS).
S0127 Skill in using data mapping tools.
S0129 Skill in using outlier identification and removal techniques.
S0130 Skill in writing scripts using R, Python, PIG, HIVE, SQL, etc.
S0160 Skill in the use of design modeling (e.g., unified modeling language).
S0202 Skill in data mining techniques (e.g., searching file systems) and analysis.
S0369 Skill in identifying sources, characteristics, and uses of the organization's data assets.
A0029 Ability to build complex data structures and high-level programming languages.
A0035 Ability to dissect a problem and examine the interrelationships between data that may appear unrelated.
A0036 Ability to identify basic common coding flaws at a high level.
A0041 Ability to use data visualization tools (e.g., Flare, HighCharts, AmCharts, D3.js, Processing, Google Visualization API, Tableau, Raphael.js).
A0066 Ability to accurately and completely source all data used in intelligence, assessment and/or planning products.
T0007 Analyze and define data requirements and specifications.
T0008 Analyze and plan for anticipated changes in data capacity requirements.
T0068 Develop data standards, policies, and procedures.
T0146 Manage the compilation, cataloging, caching, distribution, and retrieval of data.
T0195 Provide a managed flow of relevant information (via web-based portals or other means) based on mission requirements.
T0210 Provide recommendations on new database technologies and architectures.
T0342 Analyze data sources to provide actionable recommendations.
T0347 Assess the validity of source data and subsequent findings.
T0349 Collect metrics and trending data.
T0351 Conduct hypothesis testing using statistical processes.
T0353 Confer with systems analysts, engineers, programmers, and others to design applications.
T0361 Develop and facilitate data-gathering methods.
T0366 Develop strategic insights from large data sets.
T0381 Present technical information to technical and nontechnical audiences.
T0382 Present data in creative formats.
T0383 Program custom algorithms.
T0385 Provide actionable recommendations to critical stakeholders based on data analysis and findings.
T0392 Utilize technical documentation or resources to implement a new mathematical, data science, or computer science method.
T0402 Effectively allocate storage capacity in the design of data management systems.
T0403 Read, interpret, write, modify, and execute simple scripts (e.g., Perl, VBScript) on Windows and UNIX systems (e.g., those that perform tasks such as: parsing large data files, automating manual tasks, and fetching/processing remote data).
T0404 Utilize different programming languages to write code, open files, read files, and write output to different files.
T0405 Utilize open source language such as R and apply quantitative techniques (e.g., descriptive and inferential statistics, sampling, experimental design, parametric and non-parametric tests of difference, ordinary least squares regression, general line).
T0460 Develop and implement data mining and data warehousing programs.
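
As an illustration of the skills listed under S0089 (one-way hash functions), S0094 (reading hexadecimal data), and S0095 (common encoding techniques), the following is a minimal Python sketch using only the standard library. The helper names describe_bytes and xor_bytes, the sample input, and the single-byte XOR key are assumptions made for this example; they are not part of the work role definition.

```python
import base64
import hashlib
from urllib.parse import quote, unquote


def describe_bytes(data: bytes) -> dict:
    """Return a hex dump, a Base64 encoding, and two digests of `data`.

    Hypothetical helper for illustration only.
    """
    return {
        "hex": data.hex(),                                 # hexadecimal view (S0094)
        "base64": base64.b64encode(data).decode("ascii"),  # reversible encoding (S0095)
        "md5": hashlib.md5(data).hexdigest(),              # one-way hash (S0089)
        "sha256": hashlib.sha256(data).hexdigest(),        # one-way hash (S0089)
    }


def xor_bytes(data: bytes, key: int) -> bytes:
    """XOR every byte with a single-byte key; applying it twice restores the input."""
    return bytes(b ^ key for b in data)


if __name__ == "__main__":
    sample = b"abc"  # assumed sample input
    for name, value in describe_bytes(sample).items():
        print(f"{name}: {value}")

    # XOR is symmetric: encoding and decoding are the same operation.
    masked = xor_bytes(sample, 0x5A)
    print("xor round trip ok:", xor_bytes(masked, 0x5A) == sample)

    # URL encoding percent-escapes reserved characters and is reversible.
    encoded = quote("a b&c")
    print("url-encoded:", encoded, "->", unquote(encoded))
```

The distinction the sketch draws is that Base64, hexadecimal, URL encoding, and XOR with a known key can all be reversed, whereas the MD5 and SHA-256 digests cannot be turned back into the input.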