Java Technologies | Latest versions: Java SE 17, Java EE 8. Spring/Spring Boot? Microservices?
Core Java, J2EE, Servlets, JSP, EJB, JMS, JNDI, JDBC, JTS.
Major Java frameworks: Spring, Apache Struts, Hibernate, Grails, JSF (JavaServer Faces), Wicket, GWT.
Design patterns: Singleton, DAO, DTO, MVC, Front Controller, Factory Method.
Service-oriented architecture/web services: SOAP/REST.
IDEs: NetBeans, Eclipse, MyEclipse, IntelliJ IDEA.
Servers: Apache Tomcat, GlassFish Server, JBoss Server, WebLogic Server.
Web technology: HTML, CSS, JavaScript, jQuery, AJAX.
Advanced JavaScript frameworks: Angular, React, Vue.js, Node.
Design architectures: MVC, MVVM, MVVM-C, MVC-C, MVP |
.NET Technologies | Latest versions: .NET Framework 4.8, .NET 5 |
VB.NET, ASP.NET, ADO.NET, VC++.NET, C#, MVC, COM, DCOM, Visual Studio. ASP.NET is the server-side web framework; C# and VB.NET are the languages used to build ASP.NET applications. |
Front-end |
HTML, CSS, JavaScript (Angular, React, Vue), jQuery, AJAX |
Back-end |
C/C++, C#, PHP, Python, Ruby, Java, SQL, Perl. Java (backend): J2EE, JSF, JSP, web services (REST, SOAP UI), Struts, Spring, Hibernate, Servlets, Spring Boot, Microservices, REST API. Other back-end frameworks: Express, Django, Rails, Laravel |
CMS (Content Management System) |
WordPress, Joomla, Drupal, Magento, Blogger, Shopify, Bitrix, TYPO3, Squarespace, PrestaShop, DotNetNuke |
Front-end JavaScript |
Angular, React, Vue, Ember, Polymer, Backbone, Aurelia, Mithril, Webix |
Back-end JavaScript |
Node.js, Express.js, Hapi, Feathers.js, Meteor.js, Total.js, Next.js. Node.js can be used in both the frontend and backend of applications. |
JS testing frameworks: |
Jest, Mocha, Jasmine, Cypress |
Databases: |
SQL (Structured Query Language)/Relational Databases (RDBMS): MySQL, PostgreSQL, Oracle 12c, Sybase, MS SQL Server (Microsoft SQL Server), Access, Ingres. NoSQL databases: MongoDB, Cassandra, Redis, HBase, Neo4j, Oracle NoSQL, DynamoDB, Couchbase, Memcached, CouchDB |
DevOps |
Version control tools: Git, SVN, Mercurial, CVS (JIRA is issue tracking, not version control). Build tools: Ant, Maven, Gradle. Automation testing: Selenium, TestNG, JUnit. Continuous integration: Jenkins. Containers: Docker, Kubernetes. Monitoring: Splunk, ELK Stack, Nagios, New Relic, Sensu |
Cloud |
Cloud - AWS, Azure, GCP. IaaS (Infrastructure-as-a-Service), PaaS (Platform-as-a-Service), SaaS (Software-as-a-Service) |
AWS Services |
EC2, RDS, S3, Lambda, CloudFront, EBS, EFS, Athena, CloudSearch, Elasticsearch, DynamoDB |
Azure cloud services |
Azure Active Directory (AD), Azure Content Delivery Network (CDN), Azure Data Factory, Azure Machine Learning, Azure HDInsight (Hadoop), Azure Data Lake Analytics, Azure SQL Database, Azure Functions, Cosmos DB, Azure DevOps, Azure Backup, Logic Apps, Virtual Machines. Azure Data Lake can be understood as a huge repository of data in its original form for big data analytics; it is a data storage and analytics service. Output data from Azure Data Factory can be published to Azure Data Lake for Business Intelligence (BI) applications for analytics and visualization. Azure Databricks: Databricks helps with real-time data streaming and data collaboration, integrated with ADF, ML, Synapse, and Power BI. Azure Data Factory: Azure's cloud ETL service |
Data warehouses |
Data lakes and data warehouses are both widely used for storing big data. Popular cloud data warehouses include Amazon Redshift and Snowflake (DynamoDB is a NoSQL database, not a warehouse) |
ETL (Extract, Transform, Load) |
Informatica PowerCenter, Talend, IBM InfoSphere DataStage, Pentaho, AWS Glue, Azure Data Factory |
Data Engineer/ Big Data Engineer: |
Raw data gathering, data flows, pipelines, ETL. Tools: Python, SQL, PostgreSQL, MongoDB, Apache Spark, Apache Kafka, Amazon Redshift, Snowflake, Amazon Athena, Apache Airflow |
Data Scientist/Data Analyst: |
Data cleaning, Analytics, Metrics |
Big Data/Hadoop |
HDFS, MapReduce, Pig, Hive, Sqoop, Oozie, Scala, Spark, Kafka, Flume, Ambari, Hue. Hadoop ecosystem (store and process data): HDFS, YARN, Scala, MapReduce, Hive, Pig, Zookeeper, Sqoop, Oozie, Bedrock, Flume, Kafka, Impala, NiFi, MongoDB, HBase. Hadoop platforms: Hortonworks, Cloudera, Azure, Amazon Web Services (AWS). Hadoop distributions: Cloudera CDH, Hortonworks HDP |
BI (Business Intelligence) |
Tableau, Power BI, QlikView |
Big data |
Huge data |
Hadoop |
framework used for storing and processing big data |
Kafka |
used to build real-time streaming data pipelines/ stream processing |
Spark |
large-scale data processing, written in Scala |
Scala |
Scala is used in data processing, distributed computing, and web development; it runs on the JVM (Java Virtual Machine) |
Snowflake |
A cloud data platform that can serve as both a data warehouse and a data lake |
SDLC stages |
SDLC is a systematic process for building software: 1) Requirement gathering and analysis, 2) Feasibility study, 3) Design, 4) Coding, 5) Testing, 6) Installation/Deployment, 7) Maintenance (bug fixing, upgrades) |
SDLC Methods/Model |
Waterfall, Agile (Scrum), Incremental, V-Model, RAD (Rapid Application Development), Spiral, Big Bang |
Agile methodologies / Scrum |
1. Extreme Programming (XP), 2. Feature-Driven Development (FDD), 3. Adaptive System Development (ASD), 4. Dynamic Systems Development Method (DSDM), 5. Lean Software Development (LSD), 6. Kanban, 7. Crystal Clear, 8. Scrum |
Java:
Spring = application development framework.
Spring Boot = a module of the Spring Framework, used to develop REST APIs and to build microservices (a minimal sketch follows below).
REST API = the interface style commonly used to expose microservices.
Microservices = a collection of smaller independent units.
While a monolithic application is built as a single unified unit, a microservices architecture breaks it down into a collection of smaller independent units. These units carry out every application process as a separate service.
Swagger = a tool for documenting APIs.
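A minimal sketch of such a Spring Boot REST endpoint, assuming the standard spring-boot-starter-web dependency; the class name and /greeting path are illustrative:

```java
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

@SpringBootApplication
@RestController
public class GreetingService {

    // GET /greeting: one small, independently deployable endpoint.
    @GetMapping("/greeting")
    public String greeting() {
        return "Hello from a Spring Boot microservice";
    }

    public static void main(String[] args) {
        // Starts an embedded web server hosting this REST API.
        SpringApplication.run(GreetingService.class, args);
    }
}
```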
OOPs concepts (Object-Oriented Programming):
The four basics of OOP are abstraction, encapsulation, inheritance, and polymorphism, as the sketch below illustrates.
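A small, self-contained Java illustration of all four pillars (the Shape/Circle/Square names are just for the example):

```java
// Abstraction: Shape exposes area() without fixing how it is computed.
abstract class Shape {
    abstract double area();
}

// Inheritance: Circle is-a Shape. Encapsulation: radius is private.
class Circle extends Shape {
    private final double radius;

    Circle(double radius) { this.radius = radius; }

    // Polymorphism: each subclass overrides area() with its own logic.
    @Override
    double area() { return Math.PI * radius * radius; }
}

class Square extends Shape {
    private final double side;

    Square(double side) { this.side = side; }

    @Override
    double area() { return side * side; }
}

public class OopDemo {
    public static void main(String[] args) {
        Shape[] shapes = { new Circle(2.0), new Square(3.0) };
        for (Shape s : shapes) {
            // Dispatches to the right area() at runtime.
            System.out.println(s.area());
        }
    }
}
```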
Collection API and Stream API:
The Java Collections framework is used for storing and manipulating groups of data. The Stream API is used only for processing groups of data; it is part of Java 8. A short sketch of the two together:
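```java
import java.util.List;
import java.util.stream.Collectors;

public class StreamDemo {
    public static void main(String[] args) {
        // Collections framework: store a group of data.
        List<String> languages = List.of("Java", "Scala", "Kotlin", "Python");

        // Stream API (Java 8+): process the group without mutating it.
        List<String> jvmLangs = languages.stream()
                .filter(l -> !l.equals("Python")) // keep JVM languages
                .map(String::toUpperCase)          // transform each element
                .collect(Collectors.toList());

        System.out.println(jvmLangs); // [JAVA, SCALA, KOTLIN]
    }
}
```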
Version control, also known as source control, is the practice of tracking and managing changes to software code.
Messaging services: ActiveMQ, RabbitMQ, or Kafka.
Mobile app development:
Android development: Java, Kotlin, Git, XML, SQL, SDK, API, user interface, Android Studio. Kotlin: a language for Android application development. Android Studio: the official IDE for Android development. An SDK (software development kit) is a set of software tools and programs provided by hardware and software vendors that developers can use to build applications for specific platforms.
iOS development: Swift, Objective-C, Xcode, etc.
DevOps tools:
Containers are a form of operating-system virtualization used to package and deploy an application.
Continuous Development: planning and coding. The code can be written in any language, but it is maintained using version control tools: Git, SVN, Mercurial, CVS (with JIRA for issue tracking). Tools like Ant, Maven, and Gradle can be used in this phase for building/packaging the code into an executable file.
Continuous Testing: Selenium, TestNG, JUnit. Docker containers can be used for simulating the test environment.
Continuous Integration: a continuous integration tool such as Jenkins builds this code using Ant or Maven.
Continuous Deployment: this is the stage where the code is deployed to the production servers, using tools such as Puppet, Chef, SaltStack, and Ansible. Containerization tools also play an equally important role in the deployment stage: Docker and Vagrant.
Continuous Monitoring: the popular tools used for this are Splunk, ELK Stack, Nagios, New Relic, and Sensu. These tools help you monitor the application's performance and the servers.
DevOps tools:
1. Gradle
2. Git
3. Jenkins
4. Bamboo
5. Docker
6. Kubernetes
7. Puppet
8. Ansible
9. Nagios
10. Raygun
Data analyst: Excel, SQL, Tableau, Qlik Sense, data visualization, dashboards, reporting. R: statistical computing and graphics, data mining; used to clean, analyze, and graph your data. Data mining is the process of understanding data through cleaning raw data, finding patterns, creating models, and testing those models.
Power BI: data visualization/dashboards/reporting.
Data Engineer: develop, construct, test, and maintain data architectures. 1. Cloud infrastructure (AWS, Azure, Google Cloud); 2. Big data technologies (HDFS, Hive, Sqoop, Pig, Hadoop, Spark); 3. Data warehousing and various job schedulers; 4. ETL tools (Informatica, Talend); 5. Data streaming frameworks (Kafka, Spark Streaming, etc.); 6. Python, Java, R, Scala (the Pandas and NumPy libraries in Python); 7. Databases and SQL (Oracle, Greenplum, Teradata); data visualization tools (Tableau, QlikView, Spotfire); data quality tools or frameworks; Unix shell scripting; NoSQL databases and Elasticsearch; various machine learning algorithms used to solve regression, classification, and clustering problems.
Skills: Python, Java, R, Scala; Pandas and NumPy (Python libraries used in data analysis); databases and SQL (Oracle, NoSQL, Teradata); ETL tools (Informatica, Talend); big data technologies (HDFS, Hive, Sqoop, Pig, Hadoop, Spark); Kafka.
· ETL tools (Informatica, Talend)
· Databases and SQL (Oracle, Greenplum, Teradata)
· Big data technologies (HDFS, Hive, Sqoop, Pig, Hadoop, Spark)
· Cloud infrastructure (AWS, Azure, Google Cloud)
· Python, Java, R, Scala
· Pandas and NumPy
· Data visualization tools (Tableau, QlikView, Spotfire)
· Data warehousing and various job schedulers
· Data quality tools or frameworks
· Unix shell scripting
· NoSQL databases and Elasticsearch
· Data streaming frameworks (Kafka, NiFi, Spark Streaming, etc.)
· Various machine learning algorithms used to solve regression, classification, and clustering problems
Big Data: Big Data is a term for a collection of data that is huge in size and yet growing exponentially with time. Big Data analytics examples include stock exchanges, social media sites, jet engines, etc.
Apache
Hadoop is used to efficiently store and process large datasets.
Big Data/Hadoop: Hadoop is an open source, Java based framework used for storing and processing big data.
Kafka: Kafka is primarily used to build real-time streaming data pipelines, for real-time analysis, and to process real-time streams to collect Big Data; it serves streaming analytics, data integration, and mission-critical applications.
Spark: Apache Spark is used for large-scale data processing.
Scala: Scala is used in Data processing,
distributed computing, and web development.
Hadoop Ecosystem:
Some of the most well-known tools of
the Hadoop ecosystem include HDFS, Hive, Pig, YARN, MapReduce, Spark, HBase,
Oozie, Sqoop, Zookeeper.
Data Scientist: analytics and modeling, machine learning, data visualization, predictive analysis.
Skills: Python, SQL, KNIME, RapidMiner, SAS, Apache Spark, DataRobot, BigML, Go Spot Check, Mozenda, MATLAB.
· RapidMiner
· Apache Spark
· MySQL
· DataRobot
· BigML
· Go Spot Check
· Mozenda
· MATLAB
· Paxata
Azure cloud services
Azure Data Lake:
Azure Data Lake includes all the capabilities required to make it easy for developers, data scientists, and analysts to store data of any size, shape, and speed, and to do all types of processing and analytics across platforms and languages. It can store and analyse petabyte-size files and trillions of objects. Azure Data Lake is a scalable data storage and analytics service, hosted in Azure.
Azure Databricks:
Azure Databricks is optimized for Azure and tightly integrated with Azure Data Lake Storage, Azure Data Factory, Azure Machine Learning, Azure Synapse Analytics, Power BI, and other Azure services, to store all of your data on a simple, open lakehouse and unify all of your analytics and AI workloads.
Azure Databricks is a fully managed platform service offering from Microsoft Azure; in a nutshell, it is a Big Data and Machine Learning platform.
Databricks provides a unified, open platform for all your data. It empowers data scientists, data engineers, and data analysts with a simple collaborative environment to run interactive and scheduled data analysis workloads.
Azure Data Factory:
Azure Data Factory is Azure's cloud ETL service for scale-out serverless data integration and data transformation.
ADF is generally used for data movement, ETL processes, and data orchestration, whereas Databricks helps with data streaming and data collaboration in real time.
Amazon Web Services:
EC2, RDS, S3, Lambda, CloudFront, EBS, EFS, Athena, CloudSearch, Elasticsearch, DynamoDB.
Compute Services: EC2
AWS Storage Services: Amazon Simple Storage Service (S3), Amazon Elastic Block Store (EBS), Amazon Elastic File System (EFS)
AWS Container Services: Amazon Elastic Container Service (ECS), Amazon ECS Anywhere, Amazon Elastic Kubernetes Service, Amazon EKS Distro
AWS Analytics Services: Amazon Athena, Amazon CloudSearch, Amazon Elasticsearch Service, Amazon EMR, Amazon FinSpace, Amazon Kinesis
AWS Database Services: Amazon Aurora, Amazon DynamoDB, Amazon DocumentDB, Amazon ElastiCache, Amazon Keyspaces, Amazon Neptune, Amazon Quantum Ledger Database, Amazon RDS, Amazon RDS on VMware, Amazon Redshift
AWS Security, Identity, and Compliance Services: AWS Identity & Access Management (IAM), Amazon Cognito, Amazon Detective, Amazon GuardDuty, Amazon Inspector
AWS Serverless Services: AWS Lambda, Amazon API Gateway, Amazon DynamoDB, Amazon EventBridge, Amazon Simple Notification Service (SNS), Amazon Simple Queue Service (SQS)
ETL: Use ETL when you want to physically move data from multiple data sources into a single data warehouse (e.g., with Talend or Informatica). Data from one or more sources is extracted and then copied to the data warehouse. The ETL process plays a key role in data integration strategies: it is the process of moving raw data from one or more sources into a destination data warehouse, allowing businesses to gather data from multiple sources and consolidate it into a centralized location.
TIBCO: leading data integration solutions: Extract, Transform, and Load (ETL) and data virtualization.
MuleSoft: data integration, ESB. An Enterprise Service Bus (ESB) is fundamentally an architecture: a set of rules and principles for integrating numerous applications together over a bus-like infrastructure.
Programming Languages | C++, Java, J2EE/JEE, SQL/PLSQL |
Operating Systems | Windows 98/2000/XP/NT, Unix, MS-DOS, Linux |
Java Technologies | J2EE, JSP, Servlets, JDBC, JMS, MDB, JNDI, Web Services, JSF |
Web/App. Servers | Apache Tomcat 5.5 & 6.x, WebLogic 7.0/10.0, WebSphere 6.1, JBoss 4.5 |
Frameworks & Tools | Struts 1.1/2.0, JSF, Spring, MVC, ATG, Hibernate, JUnit, JPA, EasyMock, AJAX, Log4J, Eclipse, STS, TIBCO EMS |
Web Technologies | JSP, jQuery, XML, JSON, HTML5, XSLT, JavaScript, CSS, DHTML, Servlets, JSF, Ajax, REST, JSTL |
Databases | Oracle, DB2, Sybase, SQL Server, MySQL |
Design & Modeling | UML, Design Patterns, Microsoft Visio, Rational Rose 3.0, Agile SCRUM |
Tools/IDEs | RAD 7.5, NetBeans, Eclipse |
Build Tools | Ant, Maven |
Version Control Tools | CVS, SVN, Git |
JAVA (backend) | J2EE, JSF, JSP, web services (REST, SOAP UI), Struts, Spring, Hibernate, Servlets |
JAVA (frontend) UI | JavaScript, HTML, CSS, DHTML, REST API, Angular, jQuery, React.js, Bootstrap, Node.js |
Full stack is a combination of frontend and backend. |
.NET | ASP.NET, C#, VB.NET, Visual Studio, .NET Web Services, MVC, AJAX, Classic ASP, JavaScript, VBScript, HTML, DHTML, XML, CSS, jQuery, WebForms and WinForms |
QA | JBehave, Selenium Suite (Selenium IDE, Selenium WebDriver, Selenium Remote Control, Selenium Grid), HP QTP 8.0/8.2/9.0, Segue SilkTest 7.0, TestComplete, Robot Framework, Visual Studio 2010/2013 |
Generic Titles & Technologies
Business Analyst (BA) & Business Systems Analyst
Operating Systems: Linux (Alpine, Arch, Debian, Gentoo, Red Hat, Ubuntu), Mac OS, UNIX, Windows, AIX, FreeBSD
RDBMS / NoSQL: Oracle 10g, MS Access, MySQL, SQL Server, DB2, MongoDB.
Apple/iPhone development:
Environments: Xcode, Objective-C, Swift, Core Data, Realm, Storyboards, Bluetooth devices, Trello, Slack
Software Developer: implementation/coding; involved in all phases of software development (design/coding/unit testing).
Certification: it differs based on the development skill/tool.
Full stack is a combination of frontend and backend.
Front-end & Back-end:
Front-end: the interface between the user and the backend. The front-end is responsible for collecting inputs in various forms from the user; between the hardware and the end user there are several layers, and the front-end helps make the application user friendly. E.g., Java / HTML / ASP / Ruby / Mainframe.
Back-end: the database and server side, which the end user uses indirectly through an external application; it is used to store the data. E.g., SQL, Oracle, PostgreSQL, Sybase, DB2.
Front-End Technologies:
HTML/HTML5, CSS/CSS3, JavaScript (Node.js, AngularJS, ReactJS, Vue.js), jQuery, VBScript, AJAX, Twitter Bootstrap/Bootstrap 4, Express.js (framework for Node.js), ASP.NET (Active Server Pages .NET)
Back-End Technologies:
C/C++, C#, PHP, Python, Ruby on Rails (RoR), Java, SQL, Perl
Databases: MySQL, NoSQL, Oracle 12c, PostgreSQL, MongoDB, MariaDB, IBM DB2, SAP HANA
Databases:
A) SQL
B) NoSQL
The main difference between these two is that SQL databases, also called Relational Databases (RDBMS), have a relational structure, while NoSQL doesn't use relations. SQL databases are vertically scalable, which means one ultimate machine will do the work for you. On the other hand, NoSQL databases are horizontally scalable, which means multiple smaller machines will do the work for you.
A) SQL (Structured Query Language)/Relational Databases (RDBMS):
MySQL, PostgreSQL, Oracle 12c, Sybase, MS SQL Server (Microsoft SQL Server), Access, Ingres
B) NoSQL Databases:
1. MongoDB
2. Cassandra
3. Redis
4. HBase
5. Neo4j
6. Oracle NoSQL
7. DynamoDB
8. Couchbase
9. Memcached
10. CouchDB
CMS (Content Management System):
WordPress, Joomla, Drupal, Magento, Blogger, Shopify, Bitrix, TYPO3, Squarespace, PrestaShop, DotNetNuke; SQL (MySQL or PostgreSQL) and NoSQL (MongoDB or Cassandra)
Framework: a basic structure/platform.
A framework is a collection of programs that do something useful and which you can use to develop your own applications.
A framework guides you on how to do something (a predefined way of doing things).
Full Stack developer:
A full stack developer is a jack-of-all-trades across servers, databases, systems engineering, and client-facing work; they handle both front-end and back-end work.
Front end developer (Client side developer),
Skills/Technologies-
Design of user interface (UI) and user experience (UX), CSS, JavaScript, HTML,
and a growing collection of UI frameworks
Back end developer (Server side developer)
Skills/Technologies-
programming languages such as Java, C, C++, Ruby, Perl, Python, Scala, Go, etc.
Back-end developers often need to integrate with a vast array of services such
as databases, data storage systems, caching systems, logging systems, email
systems, etc.
Full Stack developer:
Design, develop, deploy, troubleshoot, and debug web solutions and back-end services; participate in the full development life cycle, including design, coding, testing, and production releases.
Database: SQL, PostgreSQL, NoSQL, MongoDB, Cassandra
Front-end is also referred to as the “client-side”, as opposed to the backend, which is basically the “server-side” of the application. The essentials of backend web development include languages such as Java, Ruby, Python, PHP, .NET, etc. The most common frontend languages are HTML, CSS, and JavaScript.
Front-end technologies
HTML5, CSS3, JavaScript, ES6
AngularJS, ReactJS, NodeJS
BackboneJS, JQuery, Vue.js
Ajax, Bootstrap, Webpack, GIT
Back-end Technologies:
The back-end has three parts to it: server, application, and database. In order to handle the back end of given applications, programmers or back-end developers deal with back-end technologies, which include languages like PHP, .NET, and Java.
Ruby, PHP, Java, .NET, Python, C++
MongoDB, MySQL, PostgreSQL
Express.js
#ReactJS vs React Native
React Native is a framework used to create mobile apps, whereas ReactJS is a JavaScript library you can use for your website. React Native is the same as React, but it uses native components instead of web components as building blocks. ReactJS is basically a JavaScript library, while React Native is an entire framework.
#Node.js can be used in both the frontend and backend of applications.
IDE (Integrated Development Environment): An IDE is an application used to write and compile code. A framework is generally a software component that someone else wrote that you can use/integrate into your own project, generally to avoid re-inventing the wheel. A framework is a tool that is closely attached to the language you are using and usually extends upon or adds to the language's features.
Java makes use of frameworks like Hibernate, Struts, and Spring to extend the language, and NetBeans or IntelliJ IDEA bring support for these tools to your Java project in a structured manner.
Visual Studio is an IDE and .NET is a framework.
IDEs, being development environments, are used to develop software programs from scratch.
NetBeans, Eclipse, IntelliJ.
Library and Framework:
Both libraries and frameworks are code written by some developer to solve a complicated problem efficiently. Their purpose is to increase the reusability of code so that you can use the same piece of code or functions again across your various projects.
What is a Library?
A library is a set of code that was previously written by a developer that you can call when you are building your project.
From a library, you import or call the specific methods that you need for your project.
In simple words, a bunch of code packed together that can be used repeatedly is known as a library.
Reusability is one of the main reasons to use libraries.
Let's understand this more clearly with the help of an example.
Think of yourself as a carpenter who needs to build a table.
You can build a table without the help of tools, but it's time-consuming and a long process; a library is like a ready-made set of tools you can pick up and use whenever you need them.
What is a Framework?
A framework is a supporting structure that gives shape to your code.
In a framework, you have to fill the structure in with your own code.
There is a specific structure for a particular framework that you have to follow, and it's generally more restrictive than a library.
One thing to remember here is that frameworks sometimes get quite large, so they may also use libraries; but a framework doesn't necessarily have to use a library.
Let's get back to our carpenter and table example for a better understanding of the framework.
Here, if you want to build a table, then you need a model or skeleton of how the table looks: the table has four legs and a top slab. This is the core structure of the table, and you have to work accordingly to build it.
Similar to this, a framework provides the structure, and you have to write your code accordingly, as the sketch below illustrates.
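A tiny Java sketch of that distinction, using only the JDK: with a library your code makes the call; with a framework you hand over code for the surrounding machinery to call (inversion of control). The Thread/Runnable pair stands in for a framework hook here:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class LibraryVsFramework {

    // Library style: YOUR code decides when to call the library.
    static List<Integer> sortScores(List<Integer> scores) {
        Collections.sort(scores); // we call Collections when we need it
        return scores;
    }

    public static void main(String[] args) {
        System.out.println(sortScores(new ArrayList<>(Arrays.asList(3, 1, 2))));

        // Framework style: you supply the code, the surrounding structure
        // decides when to run it. The Thread machinery invokes run() for us.
        Runnable hook = () -> System.out.println("the framework calls my code");
        new Thread(hook).start();
    }
}
```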
JavaScript framework:
A JavaScript framework is a collection
of pre-written JS code libraries that developers can access and use for
creating routine programming functions and features.
The basic use of a JS framework is for
building websites and web applications with ease.
Rather than writing the same code from
scratch, developers can utilize these code libraries for accessing these
programming blocks.
This obviously saves you a lot of time
which you can spend on creating other unique elements of your website.
When you're working with a JavaScript framework, you can simply search for functionality in the JS libraries and directly import its code into your site's code as required. This saves you time and energy.
There are a number of popular
frameworks with different features and uses, and you should choose the one that
fits perfectly for your needs at a given time.
Front-end JavaScript: React, Angular, Vue, Ember.js, Svelte.js, Backbone.js, Aurelia.js, Polymer.js, Mithril.js, Webix
Top 5 JavaScript Front-End Frameworks:
1. React.js
2. Vue.js 3. Angular.js 4. Ember.js 5. Polymer.js
Back-end JavaScript: Node.js, Express.js, Hapi, Feathers.js, Meteor.js, Total.js
Node.js frameworks are well known for creating REST APIs, desktop applications, and proxy servers.
Top 5 JavaScript Back-End Frameworks: 1. Express 2. Next.js 3. Meteor 4. Koa 5. Sails
JS testing frameworks: Jest, Mocha, Jasmine, Cypress.
Top Java frameworks: Spring, Apache Struts, Hibernate, Grails, JSF (JavaServer Faces), Wicket, GWT (Google Web Toolkit), Dropwizard, Play, Vaadin, Blade.
Top Java libraries: Project Lombok, Guava, jOOQ, Apache Lucene, Mockito, AssertJ.
What is Spring Boot used for?
Spring Boot is an open-source Java-based framework used to create microservices.
Spring Boot:
Spring Boot is a module of the Spring Framework. It allows us to build a stand-alone application with minimal or zero configuration.
Spring vs. Spring Boot:
The Spring Framework is a widely used Java EE framework for building applications; the Spring Boot framework is widely used to develop REST APIs.
API (Application Programming Interface): a software intermediary that allows two applications to talk to each other. An API is a set of definitions and protocols for building and integrating applications. Each time you use an app like Facebook, send an instant message, or check the weather on your phone, you're using an API. Examples: weather snippets, Pay with PayPal, Twitter bots.
Spring Boot is commonly used as the backend REST API; a client can call it as shown below.
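As an illustration of consuming such an API from Java, a sketch using the JDK 11 java.net.http client; the weather URL is a placeholder, not a real endpoint:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class ApiClientDemo {
    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();

        // Placeholder endpoint; substitute a real API URL.
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("https://api.example.com/weather?city=Boston"))
                .GET()
                .build();

        // The API replies with a machine-readable payload (often JSON).
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode());
        System.out.println(response.body());
    }
}
```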
What is the difference between Web services
and Microservices?
In the simplest of terms,
microservices and web services are defined like this: Microservice: A
small, autonomous application that performs a specific service for a larger
application architecture. Web service: A strategy to make the services of one
application available to other applications via a web interface.
What is REST API vs SOAP?
REST APIs use multiple standards like HTTP, JSON, URL, and XML for data communication and transfer, while SOAP APIs are largely based on HTTP and XML only. Because REST deploys and uses multiple standards as stated above, it takes fewer resources and less bandwidth compared to SOAP.
SOAP = protocol; REST = architecture.
What is the difference between Docker and microservices?
We can understand the difference between Docker and microservices by an analogy: Docker is a cup, in other words a container, whereas a microservice is the liquid that you pour into it. You can pour different types of liquids into the same cup; similarly, you can run many microservices in the same Docker container.
HTML to define the content of web pages.
CSS to specify the layout of web pages.
JavaScript to program the behaviour of web pages.
AJAX stands for Asynchronous JavaScript And XML. In a nutshell, it
is the use of the XMLHttpRequest object to communicate with servers. It can
send and receive information in various formats, including JSON, XML, HTML, and
text files.
AJAX allows web pages to be updated asynchronously by exchanging
small amounts of data with the server behind the scenes. This means that it is
possible to update parts of a web page, without reloading the whole page.
JSON is a text format for storing and transporting data.
XML stands for extensible markup language. A markup language is a
set of codes, or tags, that describes the text in a digital document. The most
famous markup language is hypertext markup language (HTML), which is used to
format Web pages.
SQL is a standard language for storing, manipulating, and retrieving data in databases.
SQL stands for Structured Query Language; SQL lets you access and manipulate databases.
SQL can insert records in a database and update records in a database.
SQL can delete records from a database and create new databases.
SQL can create new tables in a database and create stored procedures in a database. A minimal JDBC sketch of these operations follows.
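A minimal JDBC sketch of those statements, assuming a reachable MySQL instance and the MySQL JDBC driver on the classpath; the connection string, table, and credentials are illustrative:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.Statement;

public class SqlDemo {
    public static void main(String[] args) throws Exception {
        // Placeholder connection details for any JDBC-compatible database.
        try (Connection conn = DriverManager.getConnection(
                "jdbc:mysql://localhost:3306/demo", "user", "password")) {

            try (Statement st = conn.createStatement()) {
                // CREATE: SQL can create new tables in a database.
                st.execute("CREATE TABLE IF NOT EXISTS employees ("
                        + "id INT PRIMARY KEY, name VARCHAR(100))");
            }

            // INSERT: SQL can insert records in a database.
            try (PreparedStatement ins = conn.prepareStatement(
                    "INSERT INTO employees (id, name) VALUES (?, ?)")) {
                ins.setInt(1, 1);
                ins.setString(2, "Ada");
                ins.executeUpdate();
            }

            // SELECT: SQL lets you access and retrieve data.
            try (Statement st = conn.createStatement();
                 ResultSet rs = st.executeQuery("SELECT id, name FROM employees")) {
                while (rs.next()) {
                    System.out.println(rs.getInt("id") + " " + rs.getString("name"));
                }
            }
        }
    }
}
```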
A DBMS (Database Management System) is essentially a computerized data-keeping system. In a DBMS, the data is stored as a file, while in an RDBMS, the information is stored in tables. A DBMS can only be used by one single user, whereas multiple users can use an RDBMS.
Some DBMS examples include MySQL, PostgreSQL, Microsoft Access, SQL Server, FileMaker, Oracle, dBASE, Clipper, and FoxPro.
RDBMS stands for Relational Database Management System.
RDBMS is the basis for SQL, and for all modern database systems
such as MS SQL Server, IBM DB2, Oracle, MySQL, and Microsoft Access.
The data in RDBMS is stored in database objects called tables. A
table is a collection of related data entries and it consists of columns and
rows.
One key feature of an RDBMS is that it only keeps the tabular form of data: data in an RDBMS is stored and sorted in the form of rows and columns.
Popular relational databases: MS SQL Server, Oracle Database, MySQL, IBM Db2, Amazon Relational Database Service (RDS), PostgreSQL, SAP HANA, Amazon Aurora, MariaDB, Db2 Express-C, SQLite, CUBRID, Firebird, Oracle Database XE, SQL Server Express.
OLTP & OLAP:
An OLTP system is an accessible data processing system in today's enterprises. Some examples of OLTP systems include order entry, retail sales, and financial transaction systems.
Is SQL OLTP or OLAP?
OLTP and OLAP are both online processing systems: OLTP is an online database modifying system, whereas OLAP is an online database query answering system.
Python is commonly used for developing websites and software, task automation, data analysis, and data visualization.
It is used for: web development (server-side), software development, mathematics, and system scripting. Use Python for statistical analysis and to create data visualizations that show the big picture.
R is a programming language and free software environment for statistical computing and graphics, supported by the R Core Team and the R Foundation for Statistical Computing. It is widely used among statisticians and data miners for developing statistical software and performing data analysis; it supports statistical analysis, graphics representation, and reporting.
Machine learning is a method of
data analysis that automates analytical model building. It is a branch of
artificial intelligence based on the idea that systems can learn from data,
identify patterns and make decisions with minimal human intervention. Machine
learning (ML) is a type of artificial intelligence (AI) that allows software
applications to become more accurate at predicting outcomes without being
explicitly programmed to do so. Machine learning algorithms use historical data
as input to predict new output values.
Big Data: Hadoop, NoSQL, Apache Spark, Apache Storm, Cassandra, RapidMiner, MongoDB
Big data is a term for data sets that are so large or complex that traditional data processing application software is inadequate to deal with them. Big data challenges include capturing data, data storage, data analysis, search, sharing, transfer, visualization, querying, updating, and information privacy.
Hadoop
is an open source, Java-based programming framework that supports the
processing and storage of extremely large data sets in a distributed computing
environment. It is part of the Apache project
sponsored by the Apache Software Foundation.
Cloud
Computing:
AWS
(Amazon Web Services)/ Microsoft Azure/Google Cloud Platform
IaaS
(Infrastructure-as-a-Service), PaaS (Platform-as-a-Service), SaaS
(Software-as-a-Service)
Azure is a public cloud computing platform, with solutions including Infrastructure as a Service (IaaS), Platform as a Service (PaaS), and Software as a Service (SaaS) that can be used for services such as analytics, virtual computing, storage, networking, and much more.
Azure DevOps is the evolution of VSTS (Visual Studio Team Services).
Simply
put, cloud computing is the delivery of computing services—servers, storage,
databases, networking, software, analytics and more—over the Internet (“the
cloud”). Companies offering these computing services are called cloud providers
and typically charge for cloud computing services based on usage, similar to
how you are billed for water or electricity at home.
Salesforce CRM:
The Salesforce cloud is an on-demand customer relationship management (CRM) suite offering applications for small, midsize, and enterprise organizations, with a focus on sales and support.
Amazon Web Services (AWS) is a secure cloud services
platform, offering compute power, database storage, content delivery and other
functionality to help businesses scale and grow.
Artificial Intelligence (AI):
NLP (Natural Language Processing), AI (Artificial Intelligence), ML (Machine Learning), DL (Deep Learning). DL is a subset of ML, which is itself a subset of AI.
ETL - Extract, Transform, Load
ETL is short for extract, transform, load: three database functions that are combined into one tool to pull data out of one database and place it into another. Extract is the process of reading data from a database; in this stage, the data is collected, often from multiple and different types of sources. Transform is the process of converting the extracted data from its previous form into the form it needs to be in so that it can be placed into another database; transformation occurs by using rules or lookup tables or by combining the data with other data. Load is the process of writing the data into the target database.
How ETL Works
Data
from one or more sources is extracted and then copied to the data warehouse.
When dealing with large volumes of data and multiple source systems, the data
is consolidated.
ETL is used to migrate data from one database to another, and is often the specific process required to load data to and from data marts and data warehouses, but it is also used to convert (transform) large databases from one format or type to another.
ETL collects and redefines data, and delivers it to a data warehouse.
The process of ETL plays a key role in data integration strategies. It is the process of moving raw data from one or more sources into a destination data warehouse. ETL allows businesses to gather data from multiple sources and consolidate it into a centralized location. This is essential in making the data analysis-ready in order to have a seamless business intelligence system in place; a toy sketch of the three steps follows.
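As a toy illustration only: the three ETL steps over flat files in plain Java. Real pipelines would use a tool like Informatica, Talend, or AWS Glue; the file names and two-column CSV layout here are assumptions:

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;

public class EtlSketch {
    public static void main(String[] args) throws Exception {
        // Extract: read raw records from a source (file name is illustrative).
        List<String> raw = Files.readAllLines(Path.of("sales_raw.csv"));

        // Transform: clean and reshape. Drop the header row, trim fields,
        // and normalise the assumed second column (a region code) to upper case.
        List<String> cleaned = raw.stream()
                .skip(1)
                .map(line -> {
                    String[] f = line.split(",");
                    return f[0].trim() + "," + f[1].trim().toUpperCase();
                })
                .collect(Collectors.toList());

        // Load: write the consolidated records to the "warehouse" target.
        Files.write(Path.of("sales_warehouse.csv"), cleaned);
    }
}
```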
Popular cloud data warehouses include Amazon Redshift and Snowflake.
1. Informatica PowerCenter
2. IBM InfoSphere DataStage
3. Talend
4. Pentaho
5. AWS Glue
6. Azure Data Factory
Kafka is written in Scala and Java and is often associated with real-time event stream processing for big data.
Kafka is open source software which provides a framework for storing, reading, and analysing streaming data.
Kafka is used to stream data into data lakes, applications, and real-time stream analytics systems.
Kafka is used heavily in the big data space as a reliable way to ingest and move large amounts of data very quickly; a minimal producer sketch follows.
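A minimal sketch of publishing events to Kafka with the official Java client (kafka-clients dependency assumed); the broker address, topic, and record values are illustrative:

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class KafkaProducerDemo {
    public static void main(String[] args) {
        Properties props = new Properties();
        // Broker address is a placeholder for your cluster.
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Each record is appended to a topic that consumers read as a stream.
            producer.send(new ProducerRecord<>("page-views", "user-42", "/home"));
        }
    }
}
```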
Snowflake is an analytic data warehouse provided as
Software-as-a-Service (SaaS). Snowflake provides a data warehouse that is
faster, easier to use, and far more flexible than traditional data warehouse
offerings.
ETL is a type of data integration that refers to the three steps (extract, transform, load) used to blend data from multiple sources. It's often used to build a data warehouse.
ETL, for extract, transform, and load, is a data integration process that combines data from multiple data sources into a single, consistent data store.
Amazon Web Services (AWS) - cloud computing platforms
AWS Lambda – Serverless Compute - Amazon Web Services
AWS Lambda is an event-driven, serverless computing platform
provided by Amazon as a part of Amazon Web Services. It is a computing service
that runs code in response to events and automatically manages the computing
resources required by that code. It was introduced in November 2014.
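A minimal sketch of a Java Lambda handler, assuming the aws-lambda-java-core dependency; the event shape (a simple string map) and names are illustrative:

```java
import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.lambda.runtime.RequestHandler;
import java.util.Map;

// AWS invokes handleRequest in response to an event and manages
// the underlying compute for you; there is no server to provision.
public class HelloLambda implements RequestHandler<Map<String, String>, String> {

    @Override
    public String handleRequest(Map<String, String> event, Context context) {
        String name = event.getOrDefault("name", "world");
        context.getLogger().log("invoked with name=" + name);
        return "Hello, " + name;
    }
}
```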
Software architect: an expert who makes high-level design choices and tries to enforce technical standards, including software coding standards, tools, and platforms.
Scala vs Python vs Spark
Scala is frequently over 10 times faster than Python. In the case of Python, Spark libraries are called, which requires a lot of code processing and hence gives slower performance. In this scenario Scala works well for limited cores. Moreover, Scala is native for Hadoop as it's based on the JVM.
The difference between Spark and Scala is that Apache Spark is a cluster computing framework designed for fast Hadoop computation, while Scala is a general-purpose programming language that supports functional and object-oriented programming. Scala is one language that is used to write Spark.
Is Scala faster than PySpark?
However, this is not the only reason why PySpark is a better choice than Scala. There's more. The Python API for Spark may be slower on the cluster, but in the end, data scientists can do a lot more with it compared to Scala. It aids in data analysis and has statistics libraries that are much more mature and time-tested.
SAP BW on HANA
SAP Business Warehouse (BW) powered by HANA also known as BW on
HANA (BWoH) is SAP's data modeling, warehousing, and reporting tool built on
the HANA database. SAP BW on HANA (BWoH) runs on the HANA database and
therefore it is simpler and faster.
SAP BW is also a development platform that programmers use to create and modify data warehouses, perform data
management tasks, generate reports and develop analytics applications. Business
users typically access SAP BW through an application built by a developer, such
as an executive Dashboard or mobile app.
SAP BW on HANA and SAP BW/4HANA are different application suites
running on the same database. BW on HANA uses SAP's legacy BW software, but
moves it to the HANA database, while BW/4HANA uses a re-engineered software
suite designed to fully harness the power of the HANA database.
Types of Big Data Technologies
Before starting with the list of technologies, let us first see the broad classification of all these technologies. They can mainly be classified into 4 domains:
· Data storage
· Analytics
· Data mining
· Visualization
Data analyst skills:
Structured Query Language (SQL)
Microsoft Excel
R or Python (statistical programming)
Data visualization
Machine learning
Big Data definition: Big Data is a term used to describe a collection of data that is huge in size and yet growing exponentially with time. Big Data analytics examples include stock exchanges, social media sites, jet engines, etc.
TOP 5 BIG DATA TECHNOLOGIES
1. Hadoop Ecosystem
Hadoop Framework was developed to store and process data with a simple programming model in a distributed data processing environment. The data present on different high-speed and low-expense machines can be stored and analyzed. Enterprises have widely adopted Hadoop as a Big Data technology for their data warehouse needs in the past year. The trend seems to continue and grow in the coming year as well. Companies that have not explored Hadoop so far will most likely see its advantages and applications.
2. Artificial Intelligence
Artificial Intelligence is a broad bandwidth of computer
technology that deals with the development of intelligent machines capable of
carrying out different tasks typically requiring human intelligence. AI is
developing fast from Apple’s Siri to self-driving cars. As an interdisciplinary
branch of science, it takes into account a number of approaches such as
increased Machine Learning and Deep Learning to make a remarkable shift in most
tech industries. AI is revolutionizing the existing Big Data Technologies.
3. NoSQL Database
NoSQL includes a wide variety of different Big Data Technologies
in the database, which are developed to design modern applications. It shows a
non-SQL or non-relational database providing a method for data acquisition and
recovery. They are used in Web and Big Data Analytics in real-time. It stores
unstructured data and offers faster performance and flexibility while
addressing various data types—for example, MongoDB, Redis and Cassandra. It
provides design integrity, easier horizontal scaling and control over
opportunities in a range of devices. It uses data structures that are different
from those concerning databases by default, which speeds up NoSQL calculations.
Facebook, Google, Twitter, and similar companies store terabytes of user data daily.
4. R Programming
R is one of the open-source Big Data Technologies and programming
languages. The free software is widely used for statistical computing,
visualization, unified development environments such as Eclipse and Visual
Studio assistance communication. According to experts, it has been the world’s
leading language. The system is also widely used by data miners and
statisticians to develop statistical software and mainly data analysis.
5. Data Lakes
Data Lakes means a consolidated repository for storage of all data
formats at all levels in terms of structural and unstructured data.
Data can be saved during data accumulation as-is, without being transformed into structured data. This enables performing numerous types of data analysis, from dashboards and data visualization to big data transformation in real time, for better business inference.
Businesses that use Data Lakes stay ahead in the game from their
competitors and carry out new analytics, such as Machine Learning, through new
log file sources, data from social media and click-streaming.
This Big Data technology helps enterprises respond to better business growth opportunities by understanding and engaging clients, sustaining productivity, maintaining devices proactively, and making informed decisions.
EMERGING BIG DATA TECHNOLOGIES
1. TensorFlow
TensorFlow has a robust, scalable ecosystem of resources, tools,
and libraries for researchers, allowing them to create and deploy powerful
Machine Learning applications quickly.
2. Beam
Apache Beam offers a compact API layout to create sophisticated
Parallel Data Processing pipelines through various Execution Engines or
Runners. Apache Software Foundation developed these tools for Big Data in the
year 2016.
3. Docker
Docker is one of the tools for Big Data that makes the
development, deployment and running of container applications simpler.
Containers help developers stack an application with all of the components they
need, such as libraries and other dependencies.
4. Airflow
Apache Airflow is a process management and scheduling system for data pipelines. Airflow utilizes job workflows made up of DAGs (Directed Acyclic Graphs) of tasks. The code description of workflows makes it easy to manage, validate, and version large amounts of data.
5. Kubernetes
Kubernetes is one of the open-source tools for Big Data developed
by Google for vendor-agnostic cluster and container management. It offers a
platform for the automation, deployment, escalation and execution of container
systems through host clusters.
6. Blockchain
Blockchain is the Big Data technology that carries a unique data
safe feature in the digital Bitcoin currency so that it is not deleted or
modified after the fact is written. It’s a highly secured environment and an
outstanding option for numerous Big Data applications in various industries
like baking, finance, insurance, medical and retail, to name a few.
----
Apache Hadoop: the topmost big data tool.
Apache Spark: another popular open-source big data tool, designed to speed up Hadoop big data processing.
MongoDB, Apache Cassandra, Apache Kafka, QlikView, Qlik Sense, Tableau.
Data warehousing:
A data warehouse is constructed by integrating data from multiple
heterogeneous sources that support analytical reporting, structured and/or ad
hoc queries, and decision making. Data warehousing involves data cleaning, data
integration, and data consolidations.
In computing, a data warehouse (DW or DWH), also known as an
enterprise data warehouse (EDW), is a system used for reporting and data
analysis and is considered a core component of business intelligence. DWs are
central repositories of integrated data from one or more disparate sources
What is a data warehouse vs. a database?
A database is any collection of data organized for storage, accessibility, and retrieval. A data warehouse is a type of database that integrates copies of transaction data from disparate source systems and provisions them for analytical use.
What is a data warehouse and how does it work?
A data warehouse contains data from many operational sources and is used to analyze data. Data warehouses are analytical tools, built to support decision making and reporting for users across many departments. Data warehouses work to create a single, unified system of truth for an entire organization.
Data modeling:
Data modeling is the process of creating a visual representation
of either a whole information system or parts of it to communicate connections
between data points and structures.
Big Data is a term used to describe a collection of data that is huge in size and yet growing exponentially with time. Big Data analytics examples include stock exchanges, social media sites, jet engines, etc.
Apache Hadoop is an open source framework that is used to efficiently
store and process large datasets ranging in size from gigabytes to petabytes of
data. Instead of using one large computer to store and process the data, Hadoop
allows clustering multiple computers to analyze massive datasets in parallel
more quickly.
What is Hadoop and Big Data?
Hadoop is an open source, Java based framework used for storing and processing big data. The data is stored on inexpensive commodity servers that run as clusters. Created by Doug Cutting and Mike Cafarella, Hadoop uses the MapReduce programming model for faster storage and retrieval of data from its nodes.
Hadoop is a framework which allows users to save and process Big Data.
Kafka:
Kafka can handle huge volumes of data and remains responsive; this makes Kafka the preferred platform when the volume of the data involved is big to huge. Kafka can be used for real-time analysis as well as to process real-time streams to collect Big Data.
Apache Kafka is an open-source distributed event streaming
platform used by thousands of companies for high-performance data pipelines,
streaming analytics, data integration, and mission-critical applications.
What is Kafka used for?
Kafka is primarily used to build real-time streaming data pipelines and
applications that adapt to the data streams. It combines messaging, storage,
and stream processing to allow storage and analysis of both historical and
real-time data.
What is difference between
Kafka and MQ?
While ActiveMQ (like IBM MQ or JMS in general) is used for traditional messaging, Apache Kafka is used as a streaming platform (messaging + distributed storage + processing of data). Both are built for different use cases. You can use Kafka for "traditional messaging", but not MQ for Kafka-specific scenarios.
Kafka is a message broker; Spark is an open-source processing platform. Kafka has producers, consumers, and topics to work with data. Kafka is used for real-time streaming as a channel or mediator between source and target.
Spark: a fast and general engine for large-scale data processing.
Apache Spark is an open-source unified analytics engine for large-scale data processing. Spark provides an interface for programming entire clusters with implicit data parallelism and fault tolerance; a minimal Java sketch follows.
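A minimal sketch of that interface using Spark's Java API (spark-sql dependency assumed); the CSV file and column names are illustrative, and local[*] simply runs on all local cores for testing:

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class SparkDemo {
    public static void main(String[] args) {
        // On a real cluster you would submit this with spark-submit
        // instead of hard-coding a local master.
        SparkSession spark = SparkSession.builder()
                .appName("SalesByRegion")
                .master("local[*]")
                .getOrCreate();

        Dataset<Row> sales = spark.read()
                .option("header", "true")
                .csv("sales.csv");

        // The aggregation is distributed across the executors.
        sales.groupBy("region").count().show();

        spark.stop();
    }
}
```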
Scala:
Scala is used in Data processing, distributed computing, and web development. It
powers the data engineering infrastructure of many companies.
Hadoop Ecosystem:
Some of the most well-known tools of the Hadoop ecosystem include HDFS,
Hive, Pig, YARN, MapReduce, Spark, HBase, Oozie, Sqoop, Zookeeper.
Databricks provides a unified, open platform for all your data. It empowers data scientists, data engineers, and data analysts with a simple collaborative environment to run interactive and scheduled data analysis workloads.
Databricks is an enterprise software company founded by the creators of Apache Spark. The company has also created Delta Lake, MLflow, and Koalas, open source projects that span data engineering, data science, and machine learning.
Azure Databricks is a data analytics platform optimized for the Microsoft Azure cloud services platform. For a big data pipeline, the data (raw or structured) is ingested into Azure through Azure Data Factory in batches, or streamed near real-time using Apache Kafka, Event Hub, or IoT Hub.
Is Databricks a competitor of Snowflake?
Databricks and Snowflake are direct competitors in cloud data warehousing, although both shun that term. Snowflake now calls its product a “data cloud,” while Databricks coined the term “lakehouse” to describe a fusion between free-form data lakes and structured data warehouses.