
The data used to research the impact of speed limits on accidents derive from the
Fatality Analysis Reporting System (FARS), which is maintained by the National
Highway Traffic Safety Administration (NHTSA), a part of the U.S. Department of
Transportation. The agency has collected data on accidents since 1975. FARS
collects data within the 50 states on all fatal motor vehicle crashes. Data for
the reporting system are obtained from numerous sources, including police crash
records, state highway department data, and state vital statistics records. From
these documents, analysts code more than 100 FARS data elements. Each analyst
interprets and codes data directly onto an electronic data file. The data are
automatically checked on entry for acceptable range values and for consistency,
enabling the analyst to make corrections immediately (Transportation, 2017).

One limitation of these data is that the source documents have the potential to
be biased. When an accident report is completed, speeding is marked as a factor
if it is suspected in the accident. This judgment has a significant chance of
being subjective on the part of the person completing the paperwork. Unless a
speed detection device was present, the speeding factor relies on personal
reporting or subjective analysis. Another level of collection bias on the same
topic comes from the interpretation of speeding under current conditions. The
appropriate speed for road conditions is a judgment made by each driver, and
speeding is likely to be overreported in the event of an accident. It is easy to
attribute an accident to speeding if the weather is poor. Under such a scenario,
speeding will be overreported.

The data from NHTSA have the potential for measurement error throughout the
collection process, from the law enforcement officer completing the accident
report to the analyst updating the FARS system. Mistakes can be made and can be
aggregated across the state and nation. FARS has automatic detection software
that is used during data entry to minimize the chance of input error, but I
suspect that there is unlikely to be a comparable system to monitor reporting at
the law enforcement officer level. Additionally, from state to state and county
to county, accident reports are likely completed to different standards. The
information that gets reported, and how often, is likely to differ from one unit
to the next. These factors lead to biases in the parameter estimates and need to
be considered in the final analysis.

In order to prepare the raw accident data for regression analysis, the variable
data are aggregated at the state level. The numerous accident observations are
combined into aggregate observations. This dramatically reduces the number of
observations and limits both the predictive capability of the analysis and the
reliability of the research conclusion.
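
The aggregation step can be illustrated with a minimal sketch. The records and
field names below are made up for illustration and are not actual FARS data
elements, and the grouping by state and year is an assumption based on the
annual panel used later in the analysis.

```python
from collections import defaultdict

# Hypothetical crash-level records; field names are illustrative only.
crashes = [
    {"state": "OH", "year": 2015, "fatalities": 1, "speeding": True,  "intersection": False},
    {"state": "OH", "year": 2015, "fatalities": 2, "speeding": False, "intersection": True},
    {"state": "OH", "year": 2016, "fatalities": 1, "speeding": True,  "intersection": True},
    {"state": "TX", "year": 2015, "fatalities": 3, "speeding": True,  "intersection": False},
]

def aggregate_by_state_year(records):
    """Collapse crash-level rows into one observation per (state, year)."""
    totals = defaultdict(lambda: {"fatalities": 0, "speeding": 0, "intersection": 0})
    for r in records:
        cell = totals[(r["state"], r["year"])]
        cell["fatalities"] += r["fatalities"]          # persons killed
        cell["speeding"] += int(r["speeding"])         # crashes flagged as speeding-related
        cell["intersection"] += int(r["intersection"]) # crashes at an intersection
    return dict(totals)

panel = aggregate_by_state_year(crashes)
print(len(crashes), "crash records ->", len(panel), "state-year observations")
```

Even in this toy case, four crash records collapse into three observations,
which is the loss of information the paragraph above describes.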

In order to minimize the data limitations, a fixed effects panel data regression
was chosen as the empirical method for analysis. As discussed earlier, the data
suffer from the aggregation of accidents into state-level observations. However,
this cost is outweighed by the benefits of panel data regression. Panel data
offer some important advantages over purely cross-sectional analysis. Their
multi-dimensional structure simplifies statistical inference in many cases, and
using this method reduces the bias inherent in state-specific data. Panel data
give more informative data, more variability, less collinearity among the
variables, more degrees of freedom, and more efficiency.

Time-series studies are commonly plagued by multicollinearity, which is less
common in panel data across states. Panel data also allow one to control for
variables that cannot be observed or measured, such as cultural factors or
differences in business practices across companies, or for variables that change
over time but not across entities.

Although panel data analysis is the most suitable approach for this research,
its limitation here comes from the short time-series dimension. The panel for
this analysis involves annual data from a relatively short span of time.
Increasing the time span of the panel is not without cost either: it increases
the chances of attrition and the computational difficulty of limited dependent
variable panel data models.

With the benefits and limitations of panel data in mind, the
model formulated for this research is as follows:


Fatalities_i = β_0 + β_1(Roadway) + β_2(Roadside) + β_3(Atmospheric)
               + β_4(Intersection) + β_5(Speeding)


Fatalities – Total number of fatally injured persons in

Roadway – Total number of accidents occurring on

Roadside – Total number of accidents occurring on

Atmospheric – Total number of accidents involving prevailing atmospheric
conditions of snow, rain, or fog at the time of the crash

Intersection – Total number of accidents involving

Speeding – Total number of accidents where the driver's speed was related to the
crash, as indicated by law enforcement.


Preliminary results from Stata using the NHTSA data provide limited useful
information and suggest no relationship between the distance to an intersection
and the likelihood of a fatality. The analysis produced the following output:

Fatalities_i = 15.59 + 1.03(Roadway) + 0.72(Roadside) + 0.37(Atmospheric)
               - 0.17(Intersection) + 0.12(Speeding)

There is little intuitive interpretation to the coefficients. Roadway and
Roadside are the only statistically significant variables in the regression, and
they provide no insight into accidents, speed limits, and fatalities: some
portion of accidents on the roadway is obviously going to result in fatalities.
The results of the initial regression provide no insight into the overall
hypothesis because neither Intersection nor Speeding is significant. They are
inconclusive as to whether proximity to intersections affects fatalities.

As a general assumption, it is believed that the time-invariant component of the
error term is correlated with the independent variables in the regression model.
That is, the assumption that the error term is uncorrelated with the independent
variables is violated. This violation would result in biased parameter
estimates, so the estimator would no longer be BLUE (the best linear unbiased
estimator). This was resolved using fixed effects regression. The intuition
behind this estimation method is that the estimated coefficients on the
individual dummy variables provide an intercept for each individual (here, each
state), which is a simple way of removing the individual component from the
error term and putting it directly into the regression model. Doing so increases
the efficiency of our estimates by exploiting the panel nature of the data to
remove the unobserved component of the error term that does not change over
time.
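
The mechanics can be sketched for a single regressor. The example below is an
illustration on made-up numbers, not the Stata routine used for this research.
It relies on the standard result that demeaning each variable within an entity
(the "within" transformation) gives the same slope as including one dummy
variable per entity. The function name and the data are hypothetical.

```python
from collections import defaultdict

def within_slope(rows):
    """Fixed effects (within) slope estimate for one regressor.

    rows: iterable of (entity, x, y). Subtracting each entity's own mean from
    x and y removes any entity-specific constant before the slope is computed.
    """
    groups = defaultdict(list)
    for entity, x, y in rows:
        groups[entity].append((x, y))

    sxy = sxx = 0.0
    for obs in groups.values():
        x_bar = sum(x for x, _ in obs) / len(obs)
        y_bar = sum(y for _, y in obs) / len(obs)
        for x, y in obs:
            sxy += (x - x_bar) * (y - y_bar)
            sxx += (x - x_bar) ** 2
    return sxy / sxx

# Made-up data: y = state effect + 2 * x, with state effects of 10 (A) and 100 (B).
# State B also has larger x, so the state effect is correlated with the regressor.
rows = [("A", 1, 12), ("A", 2, 14), ("A", 3, 16),
        ("B", 5, 110), ("B", 6, 112), ("B", 7, 114)]

fe_slope = within_slope(rows)
# Pooled OLS ignores the state effects: treat every row as one entity.
pooled_slope = within_slope([("all", x, y) for _, x, y in rows])
print(fe_slope, pooled_slope)
```

In this toy example the within estimate recovers the true slope of 2, while the
pooled estimate is about 21.3 because it attributes the difference between the
two states to the regressor.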

Although panel data regressions limit the impact of multicollinearity, a review
of the collinearity among the variables showed that Roadway and Intersection
were correlated at the 0.9755 level. In hindsight, it seems obvious that the two
variables are highly related, likely redundant, and create an over-specified
equation. Perfect multicollinearity violates one of the OLS assumptions, and
near-perfect multicollinearity, as found here, inflates the standard errors. The
greater the multicollinearity, the greater the standard errors. When high
multicollinearity is present, confidence intervals for coefficients tend to be
very wide.
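
A rough sense of the size of this effect comes from the variance inflation
factor (VIF). The sketch below assumes, for simplicity, that only the pairwise
correlation between Roadway and Intersection matters, in which case
VIF = 1 / (1 - r^2). With more regressors the auxiliary R-squared can only be
higher, so this is a lower bound rather than the VIF of the full model.

```python
import math

def vif_from_correlation(r):
    """VIF implied by a single pairwise correlation r between two regressors.

    With exactly two regressors the auxiliary R-squared equals r**2, so
    VIF = 1 / (1 - r**2).
    """
    return 1.0 / (1.0 - r ** 2)

r = 0.9755  # reported correlation between Roadway and Intersection
vif = vif_from_correlation(r)
se_inflation = math.sqrt(vif)  # factor by which the standard error grows
print(f"VIF = {vif:.2f}, standard error inflation = {se_inflation:.2f}")
```

Under this assumption the VIF is roughly 20.7, meaning standard errors about 4.5
times larger than they would be with uncorrelated regressors. That is well above
the common rule-of-thumb threshold of 10.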

After testing for multicollinearity and retaining the fixed effects
specification, an updated regression was estimated as follows:

Fatalities_i = β_0 + β_1(Atmospheric) + β_2(Intersection) + β_3(Speeding)


The secondary results provided the following coefficients:


Fatalities_i = 30.35 + 1.29(Atmospheric) + 1.34(Intersection) + 1.35(Speeding)


The coefficients on Intersection and Speeding were significant at the 5% and 10%
levels, respectively. The secondary results suggest that accidents involving
intersections result in more fatalities. Removing the variable Roadway changed
the coefficient on Intersection and made it statistically significant in the
secondary regression. The preliminary results provided no insight into the role
of intersections in fatalities, whereas the secondary results suggest that
intersections are an important factor in statewide fatalities.
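
To make the coefficient interpretation concrete, the reported secondary equation
can be evaluated directly. The input counts below are hypothetical, and the
sketch treats 30.35 as a single common intercept, which ignores the
state-specific intercepts of a fixed effects model.

```python
def predicted_fatalities(atmospheric, intersection, speeding):
    """Evaluate the reported secondary regression equation."""
    return 30.35 + 1.29 * atmospheric + 1.34 * intersection + 1.35 * speeding

# Hypothetical state-level accident counts, chosen only for illustration.
base = predicted_fatalities(atmospheric=10, intersection=20, speeding=5)
plus_one = predicted_fatalities(atmospheric=10, intersection=21, speeding=5)

# Holding the other variables fixed, one more intersection accident changes
# the prediction by the Intersection coefficient.
print(round(plus_one - base, 2))
```

The difference equals the Intersection coefficient, 1.34: each additional
accident involving an intersection is associated with about 1.34 more
fatalities, holding the other variables fixed.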

