Writing Homework Help

MIS 650 Grand Canyon University User Defined Aggregate Functions Responses

 

Response 1: David Lundholm

Within R, aggregate functions split data into subsets and computes a summary of statistics for each subset in an easy-to-read formant. I used the below steps to create a connect with SQL and establish a connection to the AdventureWork database to form a data frame. Using the aggregate function to return a subtotal (sum) for aggregated columns Color and Quantity. The results will show the total inventory by color in stock.

library(odbc)

# Create a connection object

con <- DBI::dbConnect(odbc::odbc(),

Driver=”ODBC Driver 17 for SQL Server”,

Server=”DESKTOP-G2UGCQR”,

Database=”AdventureWorks2017″,

UID = “dave1”,

PWD = “hercules24”)

# Create query

Example1 = dbGetQuery(con,

“Select P.Color, I.Quantity

From Production.Product P

INNER JOIN Production.ProductInventory I

On P.ProductID=I.ProductID”)

# aggregate of product by color

total_stock_color <- aggregate(Color$Quantity, by =list(Color$ProductID), FUN = sum)

#Create column headers

colnames(total_stock_color)<-c(“ProductID”,”Quantity”)

#Print Results

total_stock_color

Response 2: Tyler Klingler

In the example below, I calculated the total sales-driven-revenue for each year (from 2011 to 2014). This information and similar information can be incredibly beneficial to a company so I definitely believe that the aggregate function is a great one to learn.

#Load libraries

pacman::p_load(“dplyr”, “odbc”, “openxlsx”, “lubridate”)

#Connect to the database

con = DBI::dbConnect(odbc::odbc(), driver=”SQL Server”, server=”DESKTOP-VBB26S1\GCUSQLSERVER”, Database=”AdventureWorks2016″, Trusted_Connection=”True”)

#Import query exactly as written in SQL

query <- dbGetQuery(con, “Select AdventureWorks2016.Sales.SalesOrderDetail.SalesOrderID, AdventureWorks2016.Sales.SalesOrderDetail.OrderQty,

AdventureWorks2016.Sales.SalesOrderDetail.ProductID, AdventureWorks2016.Sales.SalesOrderDetail.UnitPrice,

AdventureWorks2016.Sales.SalesOrderHeader.SubTotal, AdventureWorks2016.Sales.SalesOrderHeader.TaxAmt,

AdventureWorks2016.Sales.SalesOrderHeader.Freight, AdventureWorks2016.Sales.SalesOrderHeader.TotalDue,

AdventureWorks2016.Sales.SalesOrderHeader.OrderDate

From Sales.SalesOrderDetail

Inner Join Sales.SalesOrderHeader

On Sales.SalesOrderHeader.SalesOrderID = Sales.SalesOrderDetail.SalesOrderID”)

#aggregate -> Column you want to group by, column you want aggregated, aggregate function

total_sales_by_year <- aggregate(query$TotalDue, by = list(year(query$OrderDate)), sum)

#rename each column for readability

colnames(total_sales_by_year) <- c(“Year”, “Total Sales Revenue”)

#write to an xlsx file to open up in Excel

write.xlsx(total_sales_by_year, “TotalSalesByYear.xlsx”)

Response 3: Arcelia Rael

DataScience(n.d.) writes that the aggregate function works like the “Group By” statement in SQL in that it allows the aggregation of data to compute statistical analysis. In the R sample below, we pull our dataset from the AdventureWorks 2019 database using the dbGetQuery function. We then create a truncated version of the dataset to remove product names from the calculations. Using the aggregate function with the method argument equal to “max”, we return the max unit price for each product.

library(odbc)

#create connection

con = DBI::dbConnect(odbc::odbc(),

Driver=”SQL Server”,

Server=”DESKTOP-64VBUAO”,

Database=”AdventureWorks2019″,

Trusted_Connection=”True”)

#create dataframe using our connection to SQL Server

q2 = dbGetQuery(con,

” SELECT

Name,

UnitPrice

FROM [AdventureWorks2019].[Sales].[SalesOrderDetail] sod

INNER JOIN [AdventureWorks2019].[Sales].[SalesOrderHeader] soh

ON soh.SalesOrderID = sod.SalesOrderDetailID

INNER JOIN [AdventureWorks2019].[Production].[Product] p

ON p.ProductID = sod.ProductID”)

#aggregate must only have numerics, so we take out the variable name

q2_num = q2[-1]

str(q2_num)

#we use aggregate function to calculate max of q2_num data frame variables

q2Agg= aggregate(q2_num,

by = list(q2$Name),

FUN = max)

#we rename the column headers

colnames(q2Agg) <- c(“Product Name”, “Max Unit Price”)

#we print the output

q2Agg

References

DataScience. (n.d.). Aggregate() function in R. DataScience Made Simple. Retrieved August 19, 2021, from https://www.datasciencemadesimple.com/aggregate-function-in-r/

  • TK