Discussion 1: Royce
For the daily high temperatures over a year, I believe a histogram would be a good chart to use, since it groups the temperatures into ranges and draws one bar per range. I would also expect it to come out looking roughly like a bell curve, because the hottest and coldest temperatures each occur only once in the yearly cycle while the middle temperatures occur twice, once in each transition between seasons. The main downside, as our text says, is “that the binning operation central to its production is highly subjective. That is, the final graphic can depend heavily on the choice and location of the bins” (Piegorsch, 2015, p. 84), so the choices made by the person creating the graph can change its quality.
Another alternative would be a box-and-whisker plot, which gives a quicker summary of the temperatures: the ends of the whiskers show the high and low temperatures, and the box shows the median and the quartiles, between which the middle half of our data lies. As for binning the average high into 52 weeks, one benefit is that we could then use a stem-and-leaf plot, since there is less data. With 365 observations, a stem-and-leaf plot would be too chaotic. Averaging by week leaves only 52 observations, which makes the data more manageable for charts that work best with fewer points. Doing the same for quarters is harder, because a single quarter can mix very different conditions. In the second quarter, for example, April can still be really cold while June is already summer and really hot, especially toward the end of the month, so a quarterly average hides a wide spread. I would try not to use quarters for charting this data.
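To make the point about bin choice concrete, here is a minimal sketch that bins the same values three ways. The `synthetic_highs` and `histogram` helpers are hypothetical names, and the sine-wave values are made up purely to have something to bin; they are not a claim about what real 2015 temperatures look like.

```python
import math
from collections import Counter

def synthetic_highs(n_days=365):
    """Hypothetical daily highs in deg F: one smooth annual cycle, no noise."""
    return [60 + 25 * math.sin(2 * math.pi * (d - 105) / n_days)
            for d in range(n_days)]

def histogram(values, bin_width, start):
    """Count values per bin; bin k covers [start + k*width, start + (k+1)*width)."""
    counts = Counter(math.floor((v - start) / bin_width) for v in values)
    return {start + k * bin_width: counts[k] for k in sorted(counts)}

highs = synthetic_highs()

# Same data, three different choices of bin width and bin location.
for width, start in [(5, 35), (10, 35), (10, 30)]:
    print(f"bin width={width}, first bin starts at {start}")
    for left, count in histogram(highs, width, start).items():
        # One '#' per 4 days, so the bar lengths are only approximate.
        print(f"  {left:>3} to {left + width:<3} {'#' * (count // 4)}")
```

Comparing the three printouts side by side shows how the number of bars and their relative heights shift even though the underlying 365 values never change, which is the subjectivity the quote describes.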
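As a rough illustration of both ideas, the sketch below computes the five numbers a basic box plot is drawn from (in a standard box plot the line inside the box marks the median) and then reduces 365 daily values to 52 weekly averages. The data is synthetic and the helper names are hypothetical. Treating weeks as consecutive 7-day blocks, with the one leftover day folded into the last week, is an assumption, since 365 = 52 × 7 + 1.

```python
import math
from statistics import mean, median, quantiles

# Hypothetical daily highs in deg F (made-up smooth annual cycle).
daily_highs = [60 + 25 * math.sin(2 * math.pi * (d - 105) / 365)
               for d in range(365)]

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max), the values behind a simple box plot."""
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    return min(values), q1, q2, q3, max(values)

def weekly_means(values, n_weeks=52):
    """Average consecutive 7-day blocks; leftover days join the final week."""
    weeks = [[] for _ in range(n_weeks)]
    for day, value in enumerate(values):
        weeks[min(day // 7, n_weeks - 1)].append(value)
    return [mean(week) for week in weeks]

weekly = weekly_means(daily_highs)

print("daily :", len(daily_highs), "values,",
      [round(x, 1) for x in five_number_summary(daily_highs)])
print("weekly:", len(weekly), "values,",
      [round(x, 1) for x in five_number_summary(weekly)])
```

With only 52 values, each weekly average rounded to a whole degree could be written out by hand as a stem (tens digit) and leaf (ones digit), which is not practical for all 365 daily values.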
Piegorsch, W. (2015). Statistical data analytics: Foundations for data mining, informatics, and knowledge discovery. John Wiley & Sons.
Discussion 2: Arceila
According to Lewinson (2020), time plots, seasonal plots, and autocorrelation plots are best used to identify patterns, outliers, and relationships between variables over a period of time. Specifically, time plots support time series analysis, while seasonal plots (including polar seasonal plots) let the analyst examine the data over a season, where the analyst defines the season as needed (Lewinson, 2020). For example, with a seasonal plot we move from plotting the data by year to plotting it by month, which has the advantage of showing trends and patterns across months rather than across years.
Given our example, it would be beneficial to use a seasonal plot for the data grouped both by week and by quarter, because each grouping provides a different view of the data. For example, if we plotted the average high temperature for each week of the year, we could identify temperature trends and changes within each month. Specifically, we could answer questions such as: Which month is the hottest? How quickly does the average temperature change from one week to the next during fall (September through December)?
Plotting the data by quarter with a seasonal plot may allow the analyst to complete a time series analysis that roughly aligns with the four seasons. Assuming we had data for more than one year, we could use the median of the quarterly temperature highs instead of the average to assess changes over the years. For example, what trend can we see during the fall months over the past ten years? Are temperature highs increasing or decreasing?
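The week-to-week question could be approached along these lines. This is only a sketch on synthetic data: the helper names are hypothetical, weeks are taken to be ISO weeks via `date.isocalendar()`, and "fall" is approximated as ISO weeks 36 through 52, all of which are assumptions. Note that under the ISO convention 2015 has 53 weeks, with a short first and last week, so the "52 weeks" depends on how a week is defined.

```python
import math
from collections import defaultdict
from datetime import date, timedelta
from statistics import mean

def synthetic_2015():
    """Hypothetical (date, high) pairs for 2015: a made-up smooth annual cycle."""
    day, records = date(2015, 1, 1), []
    while day.year == 2015:
        doy = day.timetuple().tm_yday
        records.append((day, 60 + 25 * math.sin(2 * math.pi * (doy - 105) / 365)))
        day += timedelta(days=1)
    return records

def weekly_average_by_iso_week(records):
    """Average the highs within each ISO week number."""
    groups = defaultdict(list)
    for day, high in records:
        groups[day.isocalendar()[1]].append(high)
    return {week: mean(vals) for week, vals in sorted(groups.items())}

weekly = weekly_average_by_iso_week(synthetic_2015())
weeks = sorted(weekly)

# Change in the weekly average relative to the previous week.
changes = {w: weekly[w] - weekly[prev] for prev, w in zip(weeks, weeks[1:])}

# Assumption: ISO weeks 36-52 stand in for September through December.
fall_weeks = [w for w in changes if 36 <= w <= 52]
steepest = min(fall_weeks, key=changes.get)

print(f"{len(weekly)} ISO weeks in 2015")
print(f"steepest fall drop: week {steepest}, {changes[steepest]:.2f} deg F")
```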
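A multi-year comparison of this kind might look like the sketch below. Everything about the data is an assumption: the ten years (2006 to 2015), the seasonal shape, and the small upward drift are all invented so that there is a trend to find, and the helper names are hypothetical. Calendar quarters are used here, so Q4 (October through December) only approximates the fall months.

```python
import math
from collections import defaultdict
from datetime import date, timedelta
from statistics import median

def synthetic_record(start_year, n_years, warming_per_year=0.2):
    """Hypothetical (date, high) pairs with a small built-in upward drift."""
    records = []
    day, end = date(start_year, 1, 1), date(start_year + n_years, 1, 1)
    while day < end:
        doy = day.timetuple().tm_yday
        seasonal = 60 + 25 * math.sin(2 * math.pi * (doy - 105) / 365.25)
        drift = warming_per_year * (day.year - start_year)
        records.append((day, seasonal + drift))
        day += timedelta(days=1)
    return records

def quarterly_medians(records):
    """Group by (year, calendar quarter) and take the median high per group."""
    groups = defaultdict(list)
    for day, high in records:
        quarter = (day.month - 1) // 3 + 1
        groups[(day.year, quarter)].append(high)
    return {key: median(vals) for key, vals in sorted(groups.items())}

medians = quarterly_medians(synthetic_record(2006, 10))

# Compare the same quarter across years to look for a trend.
for year in range(2006, 2016):
    print(year, f"Q4 median high: {medians[(year, 4)]:.1f}")
```

Using the median rather than the mean means a few unusually hot or cold days within a quarter have little effect on that quarter's value.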
Lewinson, E. (2020, May 25). 5 types of plots that will help you with time series analysis. Towards Data Science. Retrieved December 4, 2021, from https://towardsdatascience.com/5-types-of-plots-that-will-help-you-with-time-series-analysis-b63747818705
Discussion 3: David
Good evening class,
Considering that the data in this scenario includes daily temperatures indicating the “high” point of each day for 2015, it is important that the selected plots are applied effectively to gain valuable insight into temperature patterns and trends. To show how high temperatures differ over time, a line graph, time series graph, histogram, or bar graph may be used. Binning the data into 52 weeks and plotting the average high temperature for each week reduces the number of values needed to represent time on the x-axis and temperature on the y-axis, from 365 points to 52. In addition, binning the data can help improve the overall performance and accuracy of the visualizations or model. “Putting numeric data into bins is a useful technique for summarising, especially for continuous data” (Ross, 2020). Depending on the question or problem that needs to be addressed, plotting the average high temperature for each of the four quarters reduces the number of values even further and can answer questions about high temperatures on a quarterly or yearly basis. However, binning the data into four quarters does not necessarily reveal insightful trends in high temperatures over shorter periods of time.
Ross, E. (2020, April 24). Excel Binning. Retrieved from https://skeptric.com/excel-binning/