Update index.adoc (#1078)
* Update index.adoc

* Update index.adoc

* Update index.adoc

* Update index.adoc

* Update index.adoc
kqlacy authored Nov 13, 2024
1 parent e242f3f commit a9b55ad
Showing 1 changed file with 245 additions and 77 deletions.
322 changes: 245 additions & 77 deletions tools-appendix/modules/ROOT/pages/index.adoc
@@ -5,81 +5,249 @@ These tools may help you deal with data. Here you will find a brief synopsis on

== Table of Contents

=== Data Science
* xref:starter-guides:data-science:data-modeling/index.adoc[The Data Modeling Process, including general principles, a step-by-step guide, and how to choose a data modeling technique]
* xref:starter-guides:data-science:data-analysis/introduction-data-analysis-techniques.adoc[Data Analysis (techniques to help you make sense of data, such as Time Series, NLP, Neural Networks, Computer Vision, etc.)]
* Data Visualization, such as with Tableau, PowerBI, Python or R
* xref:starter-guides:data-science:gather-data/free-data-sets.adoc[Gathering Data, such as using web scraping and lists of free data sources]

=== Data Engineering
* xref:starter-guides:data-engineering:containers/intro-to-containers.adoc[Containers (such as with Kubernetes, PySpark)]
* xref:starter-guides:data-engineering:databases/introduction-databases.adoc[Databases/SQL (such as with SQLite)]
* xref:starter-guides:data-engineering:slurm/introduction-slurm.adoc[SLURM]

=== Programming Languages For Data Professionals
* Python
* R
* SQL
* Perl

=== Tools, Standards & More
* xref:starter-guides:tools-and-standards:jupyter.adoc[Jupyter Notebook/Lab]
* xref:starter-guides:tools-and-standards:data-science-ethics.adoc[Data Ethics]
* xref:tools-and-standards:bookshelf.adoc[The Bookshelf]
* xref:starter-guides:tools-and-standards:data-formats/introduction-data-formats.adoc[Data Formats]
* xref:starter-guides:tools-and-standards:matlab/introduction-matlab.adoc[Matlab]
* xref:starter-guides:tools-and-standards:git/introduction-git.adoc[Git]
* xref:starter-guides:tools-and-standards:unix/introduction-unix.adoc[Unix]
* xref:Tools:PowerBI-in-Teams-Instructions.adoc[PowerBI]


=== Anvil
* xref:anvil:index.adoc[Anvil]
* xref:anvil:access-setup.adoc[User Account (ACCESS) Setup]
* xref:GitHub:github-anvil.adoc[GitHub on Anvil]
* xref:anvil:anvil-getting-started.adoc[Getting Started With Anvil]
* xref:anvil:uploading-data.adoc[Uploading Data To Anvil]
* xref:anvil:anvil-windows-vm.adoc[Setting Up Windows VM]
* xref:GitHub:git-cli.adoc[Pushing Code Using Git on Anvil]

== How to Use

The Starter Guides are meant to be use-it-when-you-need-it, so if you know what topic you are looking for, dig right in! Otherwise, this page can help you get started.

If you are brand new to dealing with data, start with xref:starter-guides:data-science:data-modeling/index.adoc[learning about the data modeling process]. If you need to gather your own dataset, xref:starter-guides:data-science:gather-data/free-data-sets.adoc[web scraping or searching for a dataset is the next step]. Once you have data, you might need to xref:starter-guides:data-science:data-modeling/choosing-model/index.adoc[select and perform an analysis technique].

For projects with a data engineering focus, check out our xref:starter-guides:data-engineering:containers/intro-to-containers.adoc[Containers], xref:starter-guides:data-engineering:databases/introduction-databases.adoc[SQL], or xref:starter-guides:data-engineering:slurm/introduction-slurm.adoc[SLURM] guides.

=== Data Engineering Vs. Data Science: What's the Difference?

If you are relatively new to dealing with data, refer to the table below to get a feel for the difference between data engineering and data science.

[cols="3,3,3"]
|===
|Discipline |Data Engineering | Data Science

|Languages Used
|Any; General Purpose Languages Most Common, like Python, Java, C++
| Python, R

|What They Do With Data
|"Move Data Around"; Collect, Organize, Set Up Databases; Set up Cloud Systems
| "Make Sense of Data"; Analyze, Train Models, Make Visualizations

|Common Backgrounds
|Computer Science, IT
|Math, Statistics, Computer Science

|Common Tools
| Hadoop, NoSQL, Spark, PostgreSQL, Kubernetes, Docker
| MapReduce, Keras, PyTorch, JAX, Plotting Packages like ggplot2

|===

WARNING: People debate whether data science and data engineering are the same discipline, and whether they even belong in the same guide. Because each depends on the other, we include both as separate categories. In some organizations, the same people do both; in others, multiple departments share these responsibilities; still others draw a much starker line between the two than we have here. How data work is divided depends on the organization, and there is no settled way to categorize these skills, though they overlap in many areas.

TIP: Data professionals of all stripes should know a mix of data engineering and data science to be successful at their jobs.

=== I Don't See The Topic I Am Looking For

Try the search bar in the top right corner of the Examples Book; it searches across our entire site.

=== xref:anvil:index.adoc[Anvil]
* xref:anvil:anvil-setup-roadmap.adoc[Anvil Setup Roadmap]
* xref:anvil:access-helpful-links.adoc[Helpful ACCESS Links]
* xref:anvil:anvil-getting-started.adoc[Getting Started with Anvil]
* xref:anvil:vscode.adoc[VSCode]
* xref:anvil:jupyter.adoc[Jupyter Lab]
* xref:anvil:gpu.adoc[GPU]

=== xref:git:introduction-git.adoc[GitHub]
* xref:git:git-cli.adoc[GitHub CLI]
* xref:git:github-desktop.adoc[GitHub Desktop]
* xref:git:github-anvil.adoc[GitHub on Anvil]
* xref:git:other-setup.adoc[Other Setup]
* xref:git:terminology.adoc[Terminology]
* xref:git:workflows.adoc[Workflows]

=== xref:matlab:introduction-matlab.adoc[Matlab]
* xref:matlab:training.adoc[Training]

=== xref:perl:index.adoc[Perl]
* xref:perl:perl-books.adoc[Perl Books]

=== xref:powerbi:index.adoc[PowerBI]
* xref:powerbi:PowerBI-in-Teams-Instructions.adoc[PowerBI in Teams]

=== xref:python:index.adoc[Python]
* xref:python:python-starter-skills-roadmap.adoc[Python Starter Skills Roadmap]
* xref:python:indentation.adoc[Indentation]
* xref:python:variables.adoc[Variables]
* xref:python:printing-and-f-strings.adoc[Printing and F-Strings]
* xref:python:logical-operators.adoc[Logical Operators]
* xref:python:tuples.adoc[Tuples]
* xref:python:lists.adoc[Lists]
* xref:python:dictionaries.adoc[Dictionaries]
* xref:python:sets.adoc[Sets]
* xref:python:control-flow.adoc[Control Flow]
* xref:python:writing-functions.adoc[Writing Functions]
* xref:python:classes.adoc[Classes]

.xref:python:writing-scripts.adoc[Writing Scripts]
[%collapsible]
====
** xref:python:argparse.adoc[argparse]
====

.xref:python:pandas-intro.adoc[pandas]
[%collapsible]
====
** xref:python:pandas-read-write-data.adoc[Reading & Writing Data]
** xref:python:pandas-series.adoc[Series]
** xref:python:pandas-dataframes.adoc[DataFrames]
** xref:python:pandas-indexing.adoc[Indexing]
** xref:python:pandas-dates-and-times.adoc[Dates and Times]
** xref:python:pandas-aggregate-functions.adoc[Aggregate Functions]
** xref:python:pandas-reshaping.adoc[Reshaping]
====

.xref:python:python-scraping.adoc[Scraping]
[%collapsible]
====
** xref:python:requests.adoc[Requests]
** xref:python:lxml.adoc[lxml]
** xref:python:selenium.adoc[Selenium]
** xref:python:web-scraping-anvil.adoc[Running on Anvil]
====

.xref:python:plotting.adoc[Plotting]
[%collapsible]
====
** xref:python:matplotlib.adoc[Matplotlib]
** xref:python:plotly-examples.adoc[Plotly]
====

.xref:python:documentation.adoc[Documentation]
[%collapsible]
====
** xref:python:docstrings-and-comments.adoc[Docstrings & Comments]
** xref:python:pdoc.adoc[pdoc]
** xref:python:sphinx.adoc[Sphinx]
====

.xref:python:testing.adoc[Testing]
[%collapsible]
====
** xref:python:pytest.adoc[pytest]
** xref:python:mypy.adoc[mypy]
====

.xref:python:serialization-and-deserialization.adoc[Serialization & Deserialization]
[%collapsible]
====
** xref:python:messagepack.adoc[MessagePack]
====
* xref:python:dask.adoc[Dask]
* xref:python:jax.adoc[JAX]

.xref:python:python-package-management.adoc[Package Management]
[%collapsible]
====
** xref:python:package-management-fundamentals.adoc[Package Management Fundamentals]
** xref:python:pypi.adoc[PyPI]
** xref:python:pip.adoc[Pip]
** xref:python:virtualenv.adoc[Virtualenv]
** xref:python:pipenv.adoc[Pipenv]
** xref:python:poetry.adoc[Poetry]
** xref:python:anaconda.adoc[Anaconda]
====
* https://codingbat.com/python[Python Coding Examples (Coding Bat)]
* https://docs.python.org/3/[Python Official Documentation]

=== xref:r:index.adoc[R]
* xref:r:variables.adoc[Variables]
* xref:r:logical-operators.adoc[Logical Operators]
* xref:r:lists-and-vectors.adoc[Lists and Vectors]
* xref:r:data-frames.adoc[data.frames]
* xref:r:reading-and-writing-data.adoc[Reading and Writing Data]
* xref:r:control-flow.adoc[Control Flow]
* xref:r:writing-functions.adoc[Writing Functions]

.xref:r:r-base-functions.adoc[R Base Functions]
[%collapsible]
====
** xref:r:ncol.adoc[ncol]
** xref:r:nrow.adoc[nrow]
** xref:r:dim.adoc[dim]
** xref:r:str.adoc[str]
** xref:r:head.adoc[head]
** xref:r:tail.adoc[tail]
** xref:r:unique.adoc[unique]
** xref:r:mean.adoc[mean]
** xref:r:median.adoc[median]
** xref:r:var.adoc[var]
** xref:r:sd.adoc[sd]
** xref:r:abs.adoc[abs]
** xref:r:sum.adoc[sum]
** xref:r:min.adoc[min]
** xref:r:max.adoc[max]
** xref:r:length.adoc[length]
** xref:r:table-and-prop-table.adoc[table & prop.table]
** xref:r:rep.adoc[rep]
** xref:r:seq.adoc[seq]
** xref:r:which.adoc[which]
** xref:r:r-grep.adoc[grep]
** xref:r:sort.adoc[sort]
** xref:r:order.adoc[order]
** xref:r:paste-and-paste0.adoc[paste & paste0]
** xref:r:cut.adoc[cut]
** xref:r:split.adoc[split]
** xref:r:subset.adoc[subset]
** xref:r:merge.adoc[merge]
====
* xref:r:apply-functions.adoc[Apply Functions]

.xref:r:plotting.adoc[Plotting]
[%collapsible]
====
** xref:r:r-base-plotting.adoc[R `graphics` plotting]
*** xref:r:barplot.adoc[barplot]
** xref:r:ggplot2.adoc[`ggplot2`]
*** xref:r:geom_point.adoc[geom_point]
====

.xref:r:tidyverse.adoc[Tidyverse]
[%collapsible]
====
** xref:r:piping.adoc[Piping]
** xref:r:select.adoc[select]
** xref:r:transmute.adoc[transmute]
** xref:r:mutate.adoc[mutate]
** xref:r:case_when.adoc[case_when]
** xref:r:between.adoc[between]
** xref:r:glimpse.adoc[glimpse]
** xref:r:filter.adoc[filter]
** xref:r:arrange.adoc[arrange]
** xref:r:group_by.adoc[group_by]
** xref:r:summarize.adoc[summarize]
** xref:r:str-extract-all.adoc[str_extract and str_extract_all]
** xref:r:lubridate.adoc[lubridate]
** xref:r:strrep.adoc[strrep]
** xref:r:nchar.adoc[nchar]
====
* xref:r:data-table.adoc[data.table]
* xref:r:sql-in-r.adoc[SQL in R]
* xref:r:r-scraping.adoc[Scraping]
* xref:r:shiny.adoc[Shiny]
* https://www.r-bloggers.com/[R Bloggers - Resource for Variety of R Topics]

=== xref:sql:index.adoc[SQL]
* xref:sql:sql-books.adoc[SQL books]
* xref:sql:terminology.adoc[Terminology]

.xref:sql:queries.adoc[Queries]
[%collapsible]
====
** xref:sql:baseball-examples.adoc[SQL Baseball examples]
** xref:sql:chinook-examples.adoc[SQL Chinook examples]
====

* xref:sql:aliasing.adoc[Aliasing]
* xref:sql:aggregate-functions.adoc[Aggregate functions]
* xref:sql:joins.adoc[Joins]

=== xref:unix:introduction-unix.adoc[UNIX]

.xref:unix:standard-utilities.adoc[Standard Utilities]
[%collapsible]
====
** xref:unix:man.adoc[man]
** xref:unix:pwd.adoc[pwd]
** xref:unix:ls.adoc[ls]
** xref:unix:cd.adoc[cd]
** xref:unix:cat.adoc[cat]
** xref:unix:head.adoc[head]
** xref:unix:tail.adoc[tail]
** xref:unix:touch.adoc[touch]
** xref:unix:cp.adoc[cp]
** xref:unix:rm.adoc[rm]
** xref:unix:rmdir.adoc[rmdir]
** xref:unix:which.adoc[which]
** xref:unix:type.adoc[type]
** xref:unix:wc.adoc[wc]
** xref:unix:cut.adoc[cut]
** xref:unix:uniq.adoc[uniq]
** xref:unix:find.adoc[find]
** xref:unix:tr.adoc[tr]
** xref:unix:grep.adoc[grep]
** xref:unix:ssh.adoc[ssh]
====

.xref:unix:text-editors.adoc[Text Editors]
[%collapsible]
====
** xref:unix:vim.adoc[vim]
** xref:unix:emacs.adoc[emacs]
** xref:unix:nano.adoc[nano]
====

.xref:unix:other-topics.adoc[Other Topics]
[%collapsible]
====
** xref:unix:permissions.adoc[Permissions]
** xref:unix:special-symbols.adoc[~ & . & ..]
** xref:unix:piping.adoc[Piping]
** xref:unix:scripts.adoc[Scripts]
====
