Main stages of statistical research. Methods and main stages of statistical research

1. STAGES OF STATISTICAL RESEARCH

The process of studying socio-economic phenomena through a system of statistical methods and quantitative characteristics - a system of indicators - is called statistical research.

The main stages of the statistical research are:

1) statistical observation;

2) summary of the data obtained;

3) statistical analysis.

If necessary, a statistical study may contain an additional stage - a statistical forecast.

Statistical observation is the scientifically organized collection of data on the phenomena and processes of public life by registering their essential features according to a pre-developed observation program. Observation data are the primary statistical information about the observed objects and serve as the basis for obtaining their general characteristics. Observation is both one of the main methods of statistics and one of the most important stages of statistical research.

A statistical study is impossible without a high-quality information base obtained through statistical observation. For this reason, ever since statistics ceased to be regarded as a purely descriptive science, special rules for conducting observations and special requirements for their results (statistical data) have been developed.

Observation is the first stage of statistical research, the quality of which determines the achievement of the final objectives of the study.

1.1. Observation is carried out according to a specially prepared program.

The program includes a list of characteristics of the research object, data about which must be obtained as a result of observation.

When preparing an observation, it is necessary to determine in advance:

1. An observation program in which:

a) the object of observation is defined, i.e. the set of units of the phenomenon to be studied. The observation unit must be distinguished from the reporting unit. A reporting unit is the unit that supplies the statistical data; it may comprise several population units or coincide with a single one. For example, in a population survey the observation unit might be the household member and the reporting unit the household.

b) the boundaries of the observation object are determined.

c) the characteristics of the object of observation are identified, information about which must be obtained as a result of observation.

2. Time of observation of an object - the time as of which or for which information about the object being studied is recorded.

3. Timing of observation. That is, the period of time for data collection and the date of completion of observation are determined. The observation period affects the completion time of the overall statistical study and the timeliness of its conclusions.

4. Funds and resources required for the observation: number of qualified specialists; material resources; means for processing observation results.

5. Requirements for statistical data. The main requirements are: a) reliability, i.e. information about the object of study should reflect its real state at the time of observation; b) comparability of data, i.e. information obtained as a result of observation must be comparable, which is ensured by a unified methodology for collecting and analyzing data, by units of measurement, etc.

1.2. There are several types of statistical observation.

1. By coverage of population units:

a) complete (covering every unit of the population);

b) non-continuous (sample observation, monographic observation, observation of the main array)

2. According to the time of registration of facts: a) current (continuous); b) discontinuous (periodic, one-time)

3. By the method of collecting information: a) direct observation; b) documentary observation; c) survey (questionnaire, correspondent, etc.)

The next stage of statistical research is the preparation of the information obtained during observation for analysis. This stage is called summary.

Summary is the process of systematizing the collected data, processing them, calculating intermediate and overall totals, and calculating interrelated analytical quantities.

Summary includes:

- systematization of the information obtained during observation;

- grouping of that information;

- development of a system of indicators characterizing the groups formed;

- creation of working tables for the grouped data;

- calculation of derived quantities using the working tables.

In the literature on the theory of statistics, one often encounters consideration of summary and grouping as independent stages of research. However, it should be noted that the concept of summary includes actions to group statistical data, so here the concept of “summary” is adopted as the name of the research stage.

Statistical analysis is the study of the characteristic features of structures, the connections between phenomena, and the trends and patterns of development of socio-economic phenomena, using specific economic-statistical and mathematical-statistical methods. Statistical analysis concludes with the interpretation of the results obtained.

Statistical forecast is a scientific identification of the state and probable paths of development of phenomena and processes, based on a system of established cause-and-effect relationships and patterns.

TASK 1

A sample survey of wages yielded the following data for 60 employees of an industrial enterprise (Table 1).

Construct an interval distribution series based on the resultant attribute, forming five groups with equal intervals.

Determine the main indicators of variation (variance, standard deviation, coefficient of variation), the mean value of the characteristic, and the structural averages. Represent the series graphically as: a) a histogram; b) a cumulative curve; c) an ogive. Draw a conclusion.

SOLUTION

1. Let us determine the range of variation of the resultant attribute (production experience) using the formula:

R = x_max – x_min = 36 – 5 = 31

where x_max is the maximum value of the attribute;

x_min is the minimum value of the attribute.

2. Determine the size of the interval

i = R/n = 31/5 = 6.2
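
The two steps above can be sketched in a few lines of Python. This is only an illustration built on the numbers quoted in the solution (minimum 5, maximum 36, five groups); the variable names are chosen here and are not part of the original text.

```python
# Range, class width, and class boundaries for an equal-interval grouping.
x_min, x_max = 5, 36      # extreme values of the attribute (from the solution)
n_groups = 5              # number of groups required by the task

value_range = x_max - x_min          # R = x_max - x_min
width = value_range / n_groups       # i = R / n

# Boundaries x_min, x_min + i, ..., x_max (rounded to suppress float noise)
boundaries = [round(x_min + k * width, 1) for k in range(n_groups + 1)]
intervals = list(zip(boundaries[:-1], boundaries[1:]))

print(value_range, width)
print(intervals)
```

The resulting boundaries coincide with the intervals used in the auxiliary table (5 to 11.2, 11.2 to 17.4, and so on).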

Using the intervals obtained, we group the employees.

3. Let's build an auxiliary table

| Group | Interval, years | Values in the group | Frequency f_i | Share of total ω_i, % | Cumulative share S_i, % | Interval midpoint x_i | x_i·f_i | x_i·ω_i | (x_i – x̄)² | (x_i – x̄)²·f_i |
| I | 5 – 11.2 | 6, 8, 7, 5, 8, 6, 10, 9, 9, 7, 6, 6, 9, 10, 7, 9, 10, 10, 11, 8, 9, 8, 7, 6, 9, 10 | 26 | 43.3 | 43.3 | 8.1 | 210.6 | 350.73 | 46.24 | 1202.24 |
| II | 11.2 – 17.4 | 16, 15, 13, 12, 14, 14, 12, 14, 17, 13, 15, 17, 14 | 13 | 21.7 | 65.0 | 14.3 | 185.9 | 310.31 | 0.36 | 4.68 |
| III | 17.4 – 23.6 | 18, 21, 20, 20, 21, 18, 19, 22, 21, 21, 21, 18, 19 | 13 | 21.7 | 86.7 | 20.5 | 266.5 | 444.85 | 31.36 | 407.68 |
| IV | 23.6 – 29.8 | 28, 29, 25, 28, 24 | 5 | 8.3 | 95.0 | 26.7 | 133.5 | 221.61 | 139.24 | 696.2 |
| V | 29.8 – 36 | 36, 35, 33 | 3 | 5.0 | 100.0 | 32.9 | 98.7 | 164.5 | 324.0 | 972.0 |
| TOTAL | | | 60 | 100.0 | | | 895.2 | 1492 | 541.2 | 3282.8 |

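4. The average value of the characteristic in the population under study is determined by the weighted arithmetic mean formula:

$$\bar{x} = \frac{\sum x_i f_i}{\sum f_i} = \frac{895.2}{60} \approx 14.9 \text{ years}$$

5. The variance and standard deviation of the characteristic are determined by the formulas:

$$\sigma^2 = \frac{\sum (x_i - \bar{x})^2 f_i}{\sum f_i} = \frac{3282.8}{60} = 54.713, \qquad \sigma = \sqrt{\sigma^2} = \sqrt{54.713} \approx 7.397$$

The coefficient of variation is:

$$V = \frac{\sigma}{\bar{x}} \cdot 100\% = \frac{7.397}{14.9} \cdot 100\% \approx 49.6\%$$

Thus, V > 33.3%; therefore, the population is heterogeneous.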
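
The calculation of the mean, variance, standard deviation and coefficient of variation from grouped data can be checked with a short script. This is a sketch, assuming the interval midpoints and the frequencies obtained by counting the values listed for each group in the auxiliary table; the variable names are illustrative.

```python
from math import sqrt

# Interval midpoints and frequencies of the five groups
# (frequencies are the counts of values listed for each group).
midpoints = [8.1, 14.3, 20.5, 26.7, 32.9]
freqs = [26, 13, 13, 5, 3]

n = sum(freqs)
mean = sum(x * f for x, f in zip(midpoints, freqs)) / n
variance = sum(f * (x - mean) ** 2 for x, f in zip(midpoints, freqs)) / n
std_dev = sqrt(variance)
cv = std_dev / mean * 100  # coefficient of variation, %

print(f"mean={mean:.2f}, variance={variance:.3f}, sd={std_dev:.3f}, V={cv:.1f}%")
```

The script uses the unrounded mean (14.92), whereas the manual calculation rounds it to 14.9, so the later decimal places may differ slightly.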

6. Determining the mode

The mode is the value of a characteristic that occurs most frequently in the population being studied. In an interval variation series, the mode is calculated using the formula:

$$M_o = x_{M_o} + i_{M_o} \cdot \frac{f_{M_o} - f_{M_o-1}}{(f_{M_o} - f_{M_o-1}) + (f_{M_o} - f_{M_o+1})}$$

where

$x_{M_o}$ is the lower limit of the modal interval;

$i_{M_o}$ is the width of the modal interval;

$f_{M_o}$, $f_{M_o-1}$, $f_{M_o+1}$ are the frequencies (or relative frequencies) of the modal, pre-modal and post-modal intervals, respectively.

The modal interval is the interval with the greatest frequency. In our problem, this is the first interval, so the pre-modal frequency is zero:

$$M_o = 5 + 6.2 \cdot \frac{26 - 0}{(26 - 0) + (26 - 13)} \approx 9.13$$
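
As a check, the mode formula for an interval series can be written as a small function. This is a minimal sketch: the function name and arguments are illustrative, and the frequencies (26 for the first group, 13 for the second) are the counts of values listed in the auxiliary table.

```python
def interval_mode(lower, width, f_modal, f_prev, f_next):
    """Mode of an interval series by the usual interpolation formula."""
    return lower + width * (f_modal - f_prev) / (
        (f_modal - f_prev) + (f_modal - f_next)
    )

# The first interval is modal: lower limit 5, width 6.2, frequency 26.
# There is no preceding interval, so its frequency is taken as 0.
mode = interval_mode(lower=5, width=6.2, f_modal=26, f_prev=0, f_next=13)
print(round(mode, 2))
```

The result agrees with the value of the mode quoted in the conclusion (about 9.13).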


7. Calculate the median.

The median is the value located in the middle of an ordered variation series; it divides the series into two equal parts, so that half of the population units have attribute values below the median and half above it.

In an interval series, the median is determined by the formula:

$$M_e = x_{M_e} + i_{M_e} \cdot \frac{\frac{1}{2}\sum f_i - S_{M_e-1}}{f_{M_e}}$$

where $x_{M_e}$ is the lower limit of the median interval;

$i_{M_e}$ is the width of the median interval;

$f_{M_e}$ is the frequency of the median interval;

$S_{M_e-1}$ is the cumulative frequency of the interval preceding the median one.

The median interval is the interval that contains the ordinal number of the median. To find it, frequencies are accumulated until the cumulative total first exceeds half of the population.

From the cumulative column of the auxiliary table we find the interval in which the cumulative share first exceeds 50%. This is the second interval, from 11.2 to 17.4, and it is the median interval.

Then

$$M_e = 11.2 + 6.2 \cdot \frac{30 - 26}{13} \approx 13.108$$

Consequently, half of the workers have less than 13.108 years of work experience, and half have more.
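
The search for the median interval and the interpolation inside it can be sketched as follows. The group boundaries and frequencies are those of the auxiliary table; the function and variable names are illustrative only.

```python
def interval_median(lower, width, total, cum_before, f_median):
    """Median of an interval series by linear interpolation."""
    return lower + width * (total / 2 - cum_before) / f_median

freqs = [26, 13, 13, 5, 3]                 # group frequencies
lowers = [5, 11.2, 17.4, 23.6, 29.8]       # lower limits of the groups
width = 6.2
total = sum(freqs)

cum = 0  # cumulative frequency of the groups already passed
for lower, f in zip(lowers, freqs):
    if cum + f > total / 2:   # first group whose cumulative frequency passes half
        median = interval_median(lower, width, total, cum, f)
        median_lower = lower
        break
    cum += f

print(median_lower, round(median, 3))
```

With these inputs the median interval starts at 11.2 and the interpolated median agrees with the 13.108 reported in the conclusion.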

8. Let us depict the series in the form of a polygon, a histogram, a cumulative curve and an ogive.

Graphical representation plays an important role in the study of variation series, since it allows statistical data to be analyzed in a simple and visual form.

There are several ways to represent a series graphically (histogram, polygon, cumulative curve, ogive); the choice depends on the purpose of the study and on the type of variation series.

A distribution polygon is mainly used to depict a discrete series, but it can also be constructed for an interval series if the series is first converted to a discrete one. The distribution polygon is a closed broken line in a rectangular coordinate system with vertices (x_i, q_i), where x_i is the i-th value of the attribute and q_i is its frequency or relative frequency.

A distribution histogram is used to display an interval series. To construct it, segments equal to the intervals of the characteristic are laid out in sequence on the horizontal axis, and rectangles are built on these segments as bases. The heights of the rectangles equal the frequencies or relative frequencies for a series with equal intervals, and the densities for a series with unequal intervals.


The cumulative curve (cumulate) is a graphical representation of a variation series in which the cumulative frequencies or relative frequencies are plotted on the vertical axis and the values of the characteristic on the horizontal axis. It is used for both discrete and interval variation series.
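
The points needed to draw the cumulative curve can be prepared as below. This is a sketch under two assumptions: the frequencies are the counts of values listed for each group in the auxiliary table, and the ogive is taken to be the same curve with the axes swapped, as is usual in this tradition. No plotting library is used; the lists can be passed to any charting tool.

```python
from itertools import accumulate

upper_bounds = [11.2, 17.4, 23.6, 29.8, 36.0]   # upper limits of the groups
freqs = [26, 13, 13, 5, 3]                      # group frequencies
total = sum(freqs)

cum_freqs = list(accumulate(freqs))                          # running totals
cum_shares = [round(100 * c / total, 1) for c in cum_freqs]  # in % of total

# Cumulate: attribute value on the x-axis, cumulative share on the y-axis
cumulate_points = list(zip(upper_bounds, cum_shares))
# Ogive (assumed definition): the same points with the axes swapped
ogive_points = [(share, x) for x, share in cumulate_points]

print(cumulate_points)
```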


Conclusion: The main indicators of variation of the series under study were calculated. The average value of the attribute (production experience) is 14.9 years, the variance is 54.713, and the standard deviation is 7.397. The mode is 9.13, and the modal interval is the first interval of the series. The median, equal to 13.108, divides the series into two equal parts: half of the employees of the organization have less than 13.108 years of work experience, and half have more.

TASK 2

The following initial data are available that characterize the dynamics for 1997 – 2001. (table 2).

Table 2 Initial data

| Year | 1997 | 1998 | 1999 | 2000 | 2001 |
| Production of granulated sugar, thousand tons | 1620 | 1660 | 1700 | 1680 | 1700 |

Determine the main indicators of the time series. Present the calculation in the form of a table. Calculate the average annual values of the indicators. Show the dynamics of the analyzed indicator graphically as a polygon. Draw a conclusion.

SOLUTION


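1) The average level of the time series is calculated using the formula:

$$\bar{y} = \frac{\sum y_i}{n} = \frac{1620 + 1660 + 1700 + 1680 + 1700}{5} = 1672$$

2) Chain and base indicators of dynamics are calculated as follows:

1. Absolute increase (base and chain):

$$A_i^{b} = y_i - y_0, \qquad A_i^{c} = y_i - y_{i-1}$$

2. Growth rate, %:

$$T_{r,i}^{b} = \frac{y_i}{y_0} \cdot 100, \qquad T_{r,i}^{c} = \frac{y_i}{y_{i-1}} \cdot 100$$

3. Rate of increase, %:

$$T_{inc,i}^{b} = T_{r,i}^{b} - 100, \qquad T_{inc,i}^{c} = T_{r,i}^{c} - 100$$

4. Average absolute increase:

$$\bar{A} = \frac{y_n - y_0}{n_c}$$

where $y_n$ is the final level of the time series;

$y_0$ is the initial level of the time series;

$n_c$ is the number of chain absolute increases.

5. Average annual growth rate:

$$\bar{T}_r = \sqrt[n_c]{\frac{y_n}{y_0}} \cdot 100\%$$

6. Average annual rate of increase:

$$\bar{T}_{inc} = \bar{T}_r - 100\%$$

3) Absolute value of a 1% increase:

$$A_{1\%} = \frac{y_{i-1}}{100}$$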
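
The indicators defined above can be computed for the series of Table 2 with a short script. This is an illustrative sketch: the dictionary keys and variable names are chosen here, and 1997 is taken as the base year.

```python
levels = {1997: 1620, 1998: 1660, 1999: 1700, 2000: 1680, 2001: 1700}
years = list(levels)
y = list(levels.values())
y0 = y[0]  # base level (1997)

rows = []
for i in range(1, len(y)):
    rows.append({
        "year": years[i],
        "abs_base": y[i] - y0,                            # base absolute increase
        "abs_chain": y[i] - y[i - 1],                     # chain absolute increase
        "growth_base": round(y[i] / y0 * 100, 1),         # base growth rate, %
        "growth_chain": round(y[i] / y[i - 1] * 100, 1),  # chain growth rate, %
        "one_percent": y[i - 1] / 100,                    # value of a 1% increase
    })

n_chain = len(y) - 1                                       # number of chain increases
avg_level = sum(y) / len(y)
avg_abs_increase = (y[-1] - y0) / n_chain
avg_growth_rate = (y[-1] / y0) ** (1 / n_chain) * 100      # %
avg_increase_rate = avg_growth_rate - 100                  # %

for row in rows:
    print(row)
```

The rate of increase for each year is the corresponding growth rate minus 100, so it is not stored separately.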

We summarize all calculated indicators in a table.

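| Indicators | 1997 | 1998 | 1999 | 2000 | 2001 |
| 1. Number of surgical operations during the period | 1620 | 1660 | 1700 | 1680 | 1700 |
| 2. Absolute increase, base | | 40 | 80 | 60 | 80 |
| Absolute increase, chain | | 40 | 40 | -20 | 20 |
| 3. Growth rate, base, % | | 102.5 | 104.9 | 103.7 | 104.9 |
| Growth rate, chain, % | | 102.5 | 102.4 | 98.8 | 101.2 |
| 4. Rate of increase, base, % | | 2.5 | 4.9 | 3.7 | 4.9 |
| Rate of increase, chain, % | | 2.5 | 2.4 | -1.2 | 1.2 |
| 5. Value of a 1% increase | | 16.2 | 16.6 | 17.0 | 16.8 |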

5) Average annual value


7. Let's depict it graphically as a polygon.


Thus, the following is obtained. The highest level for the period, 1700 surgical operations, was first reached in 1999: the absolute increase compared to the base year was 80 operations, the base growth rate relative to 1997 was 104.9%, and the base rate of increase was 4.9%. The largest chain absolute increases were in 1998 and 1999, 40 operations each. The highest chain growth rate was observed in 1998 (102.5%), and the lowest chain growth rate in the number of operations was in 2000 (98.8%).

TASK 3

There is data on sales of goods (see table 3)

Table 3 Initial data on sales of goods

Product

Base year

Reporting year

quantity

price

quantity

price

1100

1000

1350

1300

1650

1700

Determine: a) individual indices ( i p , i q); b) general indices (I p, I q, I pq); c) absolute change in trade turnover due to: 1) the number of goods; 2) prices.

Draw a conclusion based on the calculated indicators.

SOLUTION

Let's create an auxiliary table

| Product | Base year: quantity, q0 | Base year: price, p0 | Reporting year: quantity, q1 | Reporting year: price, p1 | q0·p0 | q1·p1 | i_q = q1/q0 | i_p = p1/p0 | q1·p0 |
| N | | | | | 44000 | 35000 | 0.875 | 0.909 | 38500 |
| O | 1100 | | 1000 | | 41800 | 40000 | 0.909 | 1.053 | 38000 |
| P | | | | | 7500 | 8400 | 1.200 | 0.933 | 9000 |
| R | | 1350 | | 1300 | 40500 | 26000 | 0.667 | 0.963 | 27000 |
| S | | | | | 45000 | 44000 | 1.100 | 0.889 | 49500 |
| T | 1650 | | 1700 | | 26400 | 25500 | 1.030 | 0.938 | 27200 |
| TOTAL | | | | | 205200 | 178900 | | | 189200 |
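
The general indices and the absolute changes in turnover can be obtained from the column totals of the auxiliary table. The sketch below assumes the usual aggregate forms: a physical volume index weighted by base-year prices, I_q = Σq1·p0 / Σq0·p0, a price index weighted by reporting-year quantities, I_p = Σq1·p1 / Σq1·p0, and a turnover index I_pq = Σq1·p1 / Σq0·p0. The variable names are illustrative.

```python
# Column totals taken from the auxiliary table
sum_q0p0 = 205200   # base-year turnover
sum_q1p1 = 178900   # reporting-year turnover
sum_q1p0 = 189200   # reporting-year quantities at base-year prices

I_q = sum_q1p0 / sum_q0p0    # general index of physical volume
I_p = sum_q1p1 / sum_q1p0    # general price index
I_pq = sum_q1p1 / sum_q0p0   # general turnover index (equals I_q * I_p)

delta_q = sum_q1p0 - sum_q0p0    # change in turnover due to quantities
delta_p = sum_q1p1 - sum_q1p0    # change in turnover due to prices
delta_pq = sum_q1p1 - sum_q0p0   # total change in turnover

print(round(I_q, 3), round(I_p, 3), round(I_pq, 3))
print(delta_q, delta_p, delta_pq)
```

Under these assumptions the two partial changes add up to the total change in turnover, and the product of the volume and price indices equals the turnover index.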


Conclusion: Total trade turnover fell over the year by 26,300 conventional units, of which 16,000 is due to the change in the quantity of goods sold and 10,300 to the change in prices. Turnover in the reporting year was 87.2% of its base-year level. The individual quantity indices show that sales volume grew for product “P” (120%, i.e. by 20%) and product “S” (110%, i.e. by 10%), and grew slightly for product “T” (103%). Sales of product “R” fell most sharply, to 66.7% of the base-year level; sales of product “N” were 87.5% and of product “O” 90.9% of the base-year level. The individual price indices show that the price rose only for product “O” (105.3%, i.e. by 5.3%); for all other products (“N”, “P”, “R”, “S”, “T”) prices fell, with indices of 90.9%, 93.3%, 96.3%, 88.9% and 93.8%, respectively.

The general index of physical sales volume (92.2%) indicates a decrease in total sales volume of 7.8%; the general price index (94.6%) indicates an overall decrease in prices of 5.4%; and the general trade turnover index (87.2%) indicates a decrease in trade turnover of 12.8%.

TASK 4

From the initial data of table No. 1 (select rows from 14 to 23) based on two characteristics - length of service and wages - conduct a correlation-regression analysis, determine the parameters of correlation and determination. Construct a graph of the correlation between two characteristics (resultative and factorial). Draw a conclusion.

SOLUTION

Initial data

Production experience

Salary amount

1800

2500

1750

1580

1750

1560

1210

1860

1355

1480

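Linear dependence

The parameters of the equation $\hat{y} = a_0 + a_1 x$ are determined by the method of least squares from the system of normal equations:

$$\begin{cases} n a_0 + a_1 \sum x = \sum y \\ a_0 \sum x + a_1 \sum x^2 = \sum xy \end{cases}$$

To solve the system we use the method of determinants.

The parameters are calculated using the formulas:

$$a_0 = \frac{\sum y \sum x^2 - \sum x \sum xy}{n \sum x^2 - \left(\sum x\right)^2}, \qquad a_1 = \frac{n \sum xy - \sum x \sum y}{n \sum x^2 - \left(\sum x\right)^2}$$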
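
The method of determinants for a straight-line regression can be illustrated as follows. The production experience values are not preserved in the text, so the data below are purely hypothetical placeholders and not the figures of the task; the symbols a0 and a1 for the intercept and slope of y = a0 + a1·x are likewise an assumed notation.

```python
from math import sqrt

# HYPOTHETICAL illustrative data (not the data of the task)
x = [1, 2, 3, 4, 5]   # factor attribute, e.g. length of service
y = [2, 4, 5, 4, 5]   # resultant attribute, e.g. wages

n = len(x)
sx, sy = sum(x), sum(y)
sxx = sum(v * v for v in x)
syy = sum(v * v for v in y)
sxy = sum(a * b for a, b in zip(x, y))

# Normal equations:  n*a0 + a1*sx = sy ;  a0*sx + a1*sxx = sxy
det = n * sxx - sx * sx           # main determinant of the system
a0 = (sy * sxx - sx * sxy) / det  # intercept (Cramer's rule)
a1 = (n * sxy - sx * sy) / det    # slope (Cramer's rule)

# Linear correlation coefficient and coefficient of determination
r = (n * sxy - sx * sy) / sqrt(det * (n * syy - sy * sy))
r2 = r * r

print(a0, a1, round(r, 3), round(r2, 3))
```

Replacing the two lists with the ten pairs selected from Table 1 would give the parameters required by the task.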

State educational institution

Higher professional education

"Altai State Medical University"

Federal Agency for Health and Social Development

Department of Economics and Management

Test

in the discipline "Medical Statistics"

on the topic: “Stages of statistical research”

Completed

Checked:

Barnaul - 2009

Introduction

1.1 Statistical observation

1.1.1 Classification of statistical observation according to various criteria

1.1.2 Program and methodological issues of statistical observation

2 Summary and grouping of statistical observation materials. The concept of a statistical summary, its objectives and content

3 Rational forms of presentation of statistical material

3.1 Statistical table and its elements

3.2 Graphic method for studying commercial activities

4 Solution of the problem

Conclusion

List of used literature

Introduction

Sanitary (medical) statistics studies issues related to medicine, hygiene, and healthcare. It is an important part of social hygiene and health care organization and at the same time constitutes one of the branches of statistics.

There are three main sections in sanitary statistics: population health statistics, healthcare statistics and clinical statistics.

Objectives of sanitary statistics:

identifying the characteristics of the population’s health and the factors that determine it;

study of data on the network, activities and personnel of health care facilities, as well as data on the results of treatment and recreational activities;

application of sanitary statistics methods in experimental, clinical, hygienic and laboratory research.

The materials of sanitary statistics are aimed at finding ways to improve the health of the population and improve the healthcare system.

Health statistics are used for the following reasons:

1) The development of in-depth medical-biological, physical and other research methods and the introduction of new diagnostic technology lead to the accumulation of numerical data characterizing the state of the body and the environment. Given the volume of information about the body, the need to synthesize data using statistical methods is clear;

2) Statistical methods are used to determine sanitary and hygienic standards, calculate doses of medications, determine standards of physical development, and assess the effectiveness of the methods of prevention and treatment applied.

Accounting and evaluation indicators reflect the volume or level of the phenomenon being studied; analytical indicators are used to characterize the development features of a phenomenon, its prevalence in space, the relationship of its parts, and the relationship with other phenomena.

Statistical methodology is a set of general rules (principles) and special techniques and methods of statistical research. The general rules of statistical research are based on the provisions of socio-economic theory and the principle of the dialectical method of cognition; they form the theoretical basis of statistics. On this theoretical basis, statistics applies specific methods for the numerical description of phenomena, which find expression in three stages of statistical research:

1. Mass scientifically organized observation, with the help of which primary information is obtained about individual units (factors) of the phenomenon being studied.

2. Grouping and summary of material, which represents the division of the entire mass of cases (units) into homogeneous groups and subgroups, calculating the results for each group and subgroup and recording the results in the form of a statistical table.

3. Processing of the statistical indicators obtained during the summary and analysis of the results, in order to reach substantiated conclusions about the state of the phenomenon being studied and the patterns of its development.

This is how statistics is understood as a science. The subject of statistics as a science is the study of the quantitative side of mass social phenomena in inextricable connection with their qualitative characteristics. Three main features of statistics follow from this definition:

1. the quantitative side of phenomena is explored;

2. mass social phenomena are studied;

3. a quantitative characterization of mass phenomena is given on the basis of a study of their qualitative parameters.

Statistics relies on a set of dialectical methods of cognition. Statistical research also uses special methods developed to describe statistical populations.

A statistical population is a mass of units united by a common qualitative basis but differing from each other in a number of varying characteristics. Variation (change) of characteristics (usually quantitative) can occur over time, in space, or through the dependence of one characteristic on another. For example, the size of a worker’s salary depends on the quantity of products he produces.

1.1 Statistical observation

Statistical observation is the systematic, scientifically based collection of data or information about socio-economic phenomena and processes. It is the initial stage of economic and statistical research.

Statistical observation must meet the following requirements:

1) the observed phenomena must have scientific or practical value and express certain socio-economic types of phenomena;

2) the direct collection of mass data should ensure the completeness of the facts related to the issue, since phenomena are in constant change and development. If the data are incomplete, the analysis and conclusions may be misleading;

3) to ensure the reliability of statistical data, the quality of the collected facts must be checked thoroughly and comprehensively; this is one of the most important characteristics of statistical observation;

4) statistical observation must be scientifically organized in order to create the best conditions for obtaining objective materials.

The tasks facing the manager determine the purpose of observation. The general purpose of statistical observation is to provide information support for management. The purpose determines the object of statistical observation: the totality of phenomena and objects covered by observation.

The object of observation consists of individual units. A unit of the population can be a person, a fact, an object, a process, etc. The observation unit is the primary element of the object of statistical observation; it is the carrier of the characteristics recorded during observation and the element of the population from which the necessary data are collected. The choice of object and units of observation depends on specific conditions.

Observation units have many different characteristics. A regularity that manifests itself not in an individual phenomenon but in a mass of homogeneous phenomena, when the data of a statistical population are generalized, is called a statistical regularity. The law of large numbers is of fundamental importance for the study of statistical regularities: in a large number of observations, random deviations in different directions cancel each other out.

During observation, the most significant or interrelated characteristics are recorded. A clear definition of the observation unit makes it possible to choose, on reasoned grounds, a minimum set of recorded characteristics relevant to the problem or phenomenon being studied. General approaches to determining the characteristics of an observation unit are complemented by the specific features of the processes being studied.

An observation unit should not be confused with a reporting unit. A reporting unit is the unit from which reporting data are received in accordance with approved forms. If observation is carried out through reporting, the reporting unit may coincide with the observation unit, but it need not.

After defining the object, the researcher must establish the boundaries that delimit the population or phenomenon being studied. To delimit an object, specific values or limits of characteristics are established. Such quantitative restrictions are called qualifying thresholds: a set of characteristics whose quantitative values serve, during statistical observation, as the basis for including a unit in the population being studied or excluding it.

An observation point or period is the time for which data is recorded. The moment of observation is established in accordance with the purpose and characteristics of the phenomenon. In practice, it is also called the critical moment. Some phenomena and processes have seasonal or other cyclical components.

1.1.1 Classification of statistical observation according to various criteria

Statistical observation is divided into:

1) according to the type of observations into 2 groups:

According to the coverage of units of the population into continuous and non-continuous;

2.1 Statistical study design

Statistical data analysis systems are a modern, effective tool for statistical research. Special statistical analysis systems, as well as universal tools - Excel, Matlab, Mathcad, etc., have ample opportunities for processing statistical data.

But even the most advanced tool cannot replace the researcher, who must formulate the purpose of the study, collect data, select methods, approaches, models and tools for processing and analyzing data, and interpret the results obtained.

Figure 2.1 shows a diagram of the statistical study.

Fig. 2.1 - Schematic diagram of statistical research

The starting point of statistical research is the formulation of the problem. When determining it, the purpose of the study is taken into account, what information is needed and how it will be used when making a decision is determined.

The statistical study itself begins with the preparatory stage. During this stage, analysts study the terms of reference, a document drawn up by the customer of the study. The terms of reference must clearly state the objectives of the research and:

    define the object of research;

    list the assumptions and hypotheses that must be confirmed or refuted during the study;

    describe how the research results will be used;

    set the time frame within which the study must be carried out and the budget for the study.

Based on the terms of reference, the structure of the analytical report (the form in which the results of the study must be presented) and the statistical observation program are developed. The program is a list of characteristics that must be recorded during observation (or questions to which reliable answers must be obtained for each surveyed observation unit). The content of the program is determined by the characteristics of the observed object, the objectives of the study, and the methods chosen by analysts for further processing of the collected information.

The main stage of statistical research includes the collection of necessary data and their analysis.

The final stage of the research is drawing up an analytical report and submitting it to the customer.

Figure 2.2 presents a diagram of statistical data analysis.

Fig.2.2 – Main stages of statistical analysis

2.2 Collection of statistical information

Collecting materials involves analyzing the terms of reference of the study, identifying sources of the necessary information and (if necessary) developing questionnaires. When sources of information are examined, all required data are divided into primary data (not yet available and to be collected directly for this study) and secondary data (previously collected for other purposes).

Secondary data collection is often referred to as “desk” or “library” research.

Examples of collecting primary data: observing store visitors, surveying hospital patients, discussing a problem at a meeting.

Secondary data is divided into internal and external.

Examples of internal secondary data sources:

    information system of the organization (including the accounting subsystem, the sales management subsystem, the CRM (Customer Relationship Management) system, i.e. application software designed to automate an organization's strategies for interacting with customers, and others);

    previous studies;

    written reports from employees.

Examples of external secondary data sources:

    reports from statistical bodies and other government agencies;

    reports from marketing agencies, professional associations, etc.;

    electronic databases (address directories, GIS, etc.);

    libraries;

    mass media.

The main outputs at the data collection stage are:

    planned sample size;

    sample structure (presence and size of quotas);

    type of statistical observation (data collection, survey, questionnaire, measurement, experiment, examination, etc.);

    information about survey parameters (for example, the possibility of falsification of questionnaires);

    scheme for encoding variables in the database of the program selected for processing;

    data conversion plan;

    plan diagram of the statistical procedures used.

This same stage includes the survey procedure itself. Of course, questionnaires are developed only to obtain primary information.

The received data must be edited and prepared accordingly. Each questionnaire or observation form is checked and, if necessary, adjusted. Each answer is assigned numeric or letter codes - the information is encoded. Data preparation includes editing, transcribing and checking data, coding and necessary transformations.

2.3 Determination of sample characteristics

As a rule, the data collected through statistical observation for statistical analysis form a sample. The sequence of data transformations in the course of statistical research can be represented schematically as follows (Fig. 2.3).

Fig 2.3 Statistical data conversion scheme

By analyzing a sample, it is possible to draw conclusions about the population represented by the sample.

The final determination of the general sampling parameters is made once all the questionnaires have been collected. It includes:

    determining the actual number of respondents,

    determination of the sampling structure,

    distribution by survey location,

    establishing a confidence level for the statistical reliability of the sample,

    calculation of statistical error and determination of representativeness of the sample.

The actual number of respondents may turn out to be larger or smaller than planned. The first case is better for analysis but is disadvantageous for the customer of the research. The second may reduce the quality of the research and is therefore beneficial to neither the analysts nor the customers.

The sampling structure may be random or non-random (respondents are selected by a previously known criterion, for example by the quota method). Random samples are a priori representative. Non-random samples may be intentionally unrepresentative of the population yet provide important information for the research. In this case, the filtering questions of the questionnaire, which are designed to screen out respondents who do not meet the requirements, should also be considered carefully.

To determine the accuracy of the estimate, first set the confidence level (95% or 99%). The maximum statistical error of the sample is then calculated as

$$\Delta = t\sqrt{\frac{pq}{n}}$$

or

$$\Delta = t\sqrt{\frac{\sigma^2}{n}},$$

where $n$ is the sample size, $p$ is the probability of the occurrence of the event under study (the respondent being included in the sample), $q = 1 - p$ is the probability of the opposite event (the respondent not being included in the sample), $t$ is the confidence coefficient, and $\sigma^2$ is the variance of the characteristic.
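The error calculation described above can be sketched as follows. This is a minimal illustration, assuming the usual forms t·sqrt(p·q/n) for a share and t·sqrt(variance/n) for a mean; the function name and the input values (t = 1.96, n = 400, p = 0.5, variance = 100) are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import math


def max_sampling_error(t, n, p=None, variance=None):
    """Maximum sampling error for a share (pass p) or for a mean (pass variance).

    t is the confidence coefficient and n the sample size.
    Assumed formulas: t * sqrt(p * (1 - p) / n) or t * sqrt(variance / n).
    """
    if (p is None) == (variance is None):
        raise ValueError("pass exactly one of p or variance")
    spread = p * (1 - p) if p is not None else variance
    return t * math.sqrt(spread / n)


# Hypothetical inputs, not taken from any real survey.
error_share = max_sampling_error(t=1.96, n=400, p=0.5)
error_mean = max_sampling_error(t=1.96, n=400, variance=100.0)
print(round(error_share, 3), round(error_mean, 2))
```

With p = 0.5 the product p·q is at its largest, so this value is a common conservative choice when the true share is unknown.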

Table 2.4 shows the most commonly used values of the confidence probability and the corresponding confidence coefficients.

Table 2.4
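Under the common assumption that the estimate is approximately normally distributed, the confidence coefficient for a given confidence probability can be computed rather than looked up. The sketch below uses the standard library's `statistics.NormalDist`; the function name is hypothetical, and the two-sided normal quantile is an assumption about how the coefficient is defined.

```python
from statistics import NormalDist


def confidence_coefficient(confidence_level):
    """Two-sided standard normal quantile for a confidence probability in (0, 1)."""
    if not 0 < confidence_level < 1:
        raise ValueError("confidence_level must be between 0 and 1")
    # For a two-sided interval, half of the remaining probability lies in each tail.
    return NormalDist().inv_cdf(0.5 + confidence_level / 2)


for level in (0.95, 0.99):
    print(level, round(confidence_coefficient(level), 2))
```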

2.5 Data processing on a computer

Analyzing data using a computer involves performing a number of necessary steps.

1. Determination of the structure of the source data.

2. Entering data into the computer in accordance with its structure and program requirements. Editing and converting data.

3. Specifying a data processing method in accordance with the objectives of the study.

4. Obtaining the result of data processing. Editing it and saving it in the required format.

5. Interpretation of the processing result.

No computer program can perform steps 1 (preparatory) and 5 (final) - the researcher does them himself. Steps 2-4 are performed by the researcher using the program, but it is the researcher who determines the necessary procedures for editing and transforming data, methods of data processing, as well as the format for presenting the processing results. The computer's help (steps 2–4) ultimately involves moving from a long sequence of numbers to a more compact one. At the “input” of the computer, the researcher submits an array of initial data that is inaccessible to comprehension, but suitable for computer processing (step 2). Then the researcher gives the program a command to process the data in accordance with the task and data structure (step 3). At the “output”, he receives the result of processing (step 4) - also an array of data, only smaller, accessible to comprehension and meaningful interpretation. At the same time, an exhaustive analysis of data usually requires repeated processing using different methods.

2.6 Selecting a data analysis strategy

The choice of strategy for analyzing the collected data is based on knowledge of the theoretical and practical aspects of the subject area under study, the specifics and known characteristics of the information, the properties of specific statistical methods, as well as the experience and views of the researcher.

It must be remembered that data analysis is not the final goal of the research. Its purpose is to obtain information that will help solve a specific problem and support sound management decisions. The choice of analysis strategy should begin with a review of the results of the earlier stages of the process: defining the problem and developing a research plan. The preliminary data analysis plan, developed as one element of the research plan, serves as a draft. Then, as additional information becomes available at later stages of the research process, certain changes may need to be made.

Statistical methods are divided into univariate and multivariate. Univariate methods are used when all elements of the sample are assessed by a single indicator, or when there are several indicators for each element but each variable is analyzed separately from all the others.

Multivariate techniques are appropriate when two or more measures are used to evaluate each sample element and these variables are analyzed simultaneously. Such methods are used to determine dependencies between phenomena.

Multivariate methods differ from univariate methods primarily in that the focus of attention shifts from the levels (averages) and distributions (variances) of phenomena to the degree of relationship (correlation or covariance) between them.

Univariate methods can be classified by whether the data being analyzed are metric or non-metric (Figure 3). Metric data are measured on an interval or ratio scale. Non-metric data are assessed on a nominal or ordinal scale.

Additionally, these methods are divided into classes based on how many samples—one, two, or more—are analyzed in the study.

The classification of univariate statistical methods is presented in Fig. 2.4.

Fig. 2.4 Classification of univariate statistical methods depending on the analyzed data

The number of samples is determined by how the data are handled in a particular analysis, not by how they were collected. For example, data on males and females can be obtained within the same sample, but if the analysis is aimed at identifying differences in perception by gender, the researcher will have to work with two different samples. Samples are considered independent if they are not experimentally related to each other: measurements taken in one sample do not affect the values of variables in another. For analysis, data from different groups of respondents, such as those collected from females and males, are usually treated as independent samples.

On the other hand, if the data from two samples refer to the same group of respondents, the samples are considered paired - dependent.

If there is a single sample of metric data, the z-test and t-test can be used. If there are two or more independent samples, the two-sample z-test and t-test can be used for two samples, and one-way analysis of variance for more than two. For two related samples, a paired t-test is used. For non-metric data from a single sample, the researcher can use the frequency distribution, the chi-square test, the Kolmogorov-Smirnov (K-S) test, the runs test, and the binomial test. For two independent samples of non-metric data, the following methods are available: the chi-square, Mann-Whitney, median, and K-S tests, and Kruskal-Wallis one-way analysis of variance. If there are two or more related samples, the sign, McNemar, and Wilcoxon tests should be used.
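The selection rules in the paragraph above can be written down as a small lookup table. This is only a sketch of the classification as stated in the text: the key names ("metric", "two independent", and so on) and the function name are hypothetical, and the table does not check whether the assumptions of any individual test hold.

```python
# Univariate tests named in the text, keyed by (data type, sample design).
# The key strings are illustrative labels, not standard terminology.
UNIVARIATE_TESTS = {
    ("metric", "one sample"): ["z-test", "t-test"],
    ("metric", "two independent"): ["two-sample z-test", "two-sample t-test"],
    ("metric", "more than two independent"): ["one-way ANOVA"],
    ("metric", "two related"): ["paired t-test"],
    ("non-metric", "one sample"): [
        "frequency distribution", "chi-square", "Kolmogorov-Smirnov",
        "runs", "binomial",
    ],
    ("non-metric", "two independent"): [
        "chi-square", "Mann-Whitney", "median", "Kolmogorov-Smirnov",
        "Kruskal-Wallis",
    ],
    ("non-metric", "related"): ["sign", "McNemar", "Wilcoxon"],
}


def candidate_tests(data_type, design):
    """Return the tests the text lists for this data type and sample design."""
    try:
        return UNIVARIATE_TESTS[(data_type, design)]
    except KeyError:
        raise ValueError(f"no entry for {data_type!r} / {design!r}") from None


print(candidate_tests("metric", "two related"))
```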

Multivariate statistical methods are aimed at identifying existing patterns: interdependence of variables, relationship or sequence of events, inter-object similarity.

Somewhat conventionally, five standard types of patterns of significant interest can be distinguished: association, sequence, classification, clustering, and forecasting.

An association occurs when several events are related to each other. For example, a study conducted in a supermarket may show that 65% of those who buy corn chips also buy Coca-Cola, and if there is a discount for such a set, they buy Coke in 85% of cases. Having information about such an association, it is easy for managers to assess how effective the discount provided is.
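A figure such as "65% of those who buy corn chips also buy Coca-Cola" is the share of baskets containing the first item that also contain the second. The sketch below shows that calculation on a handful of made-up baskets; the data, the item names, and the function name are hypothetical and do not reproduce the percentages quoted in the text.

```python
def rule_confidence(transactions, antecedent, consequent):
    """Share of transactions containing `antecedent` that also contain `consequent`.

    Returns None when no transaction contains the antecedent.
    """
    with_antecedent = [t for t in transactions if antecedent in t]
    if not with_antecedent:
        return None
    with_both = [t for t in with_antecedent if consequent in t]
    return len(with_both) / len(with_antecedent)


# Hypothetical baskets, for illustration only.
baskets = [
    {"corn chips", "cola"},
    {"corn chips", "cola", "bread"},
    {"corn chips"},
    {"bread", "milk"},
    {"corn chips", "cola", "milk"},
]
confidence = rule_confidence(baskets, "corn chips", "cola")
print(confidence)
```

Comparing this share for periods with and without a discount is one simple way to judge the effect of the discount.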

If there is a chain of events related in time, then we talk about a sequence. For example, after buying a house, in 45% of cases, a new kitchen stove is purchased within a month, and within two weeks, 60% of new residents acquire a refrigerator.

Classification identifies the features that characterize the group to which a particular object belongs. This is done by analyzing objects that have already been classified and formulating a set of rules.

Clustering differs from classification in that the groups themselves are not predefined. Using clustering, various homogeneous groups of data are identified.

All kinds of forecasting systems are based on historical information stored in the form of time series. If patterns can be constructed that adequately reflect the dynamics of the target indicators, they may make it possible to predict the future behavior of the system.

Multivariate statistical methods can be divided into relationship analysis methods and classification analysis (Fig. 2.5).

Fig. 2.5 – Classification of multivariate statistical methods

Processing of collected primary data, including their grouping, generalization and presentation in tables, constitutes the second stage of statistical research, which is called summary.

There are 3 main forms of presenting processed statistical data: text, tabular and graphic.

At the third stage of the statistical study, a scientific analysis of the phenomena under study is carried out on the basis of the final summary data: various generalizing indicators are calculated in the form of average and relative values, and patterns in distributions, the dynamics of indicators, and so on are identified. Based on the identified patterns, forecasts are made for the future.

Statistical observation is the first stage of statistical research. Almost always, in accordance, of course, with the goals and objectives of research, work begins with taking into account facts and collecting primary material. Primary material is the foundation of statistical research. The success of the entire study as a whole depends on the quality of statistical observation. It must be organized in such a way that the result is objective, accurate data about the phenomenon being studied. Incomplete, inaccurate data that does not sufficiently characterize the process, especially if it distorts it, leads to errors. And an analysis carried out on such a basis will be erroneous. It follows that the recording of facts and the collection of primary material must be carefully thought out and organized.

It should be noted once again that statistical observations are always carried out on a mass scale. The law of large numbers comes into play: the larger the population, the more objective the results obtained will be.

Statistical observation can be divided into three stages:

1. Preparation of observation. This covers the formulation of the observation program and the definition of the indicators, grouped into layouts of the final statistical tables.

The questions that make up the content of the program should follow from the purpose of the study or from the hypothesis that the study is meant to confirm. An important element is the layout of the final statistical tables. These layouts are the blueprint for processing the observation results; only when they are available is it possible to identify all the questions that need to be included in the program and to avoid collecting unnecessary information.

2. Direct collection of material. This is the most labor-intensive stage of the study. Statistical reporting, as a special form of organizing data collection, is inherent only in state statistics. All other information is collected with a variety of statistical tools. Two main requirements apply to the collected data: reliability and comparability. Timeliness is also highly desirable, and in market conditions its importance increases many times over.



3. Control of the material before its analysis. No matter how carefully the observation tools are compiled and the performers instructed, the observation materials always need to be checked. This is explained by the mass nature of statistical work and the complexity of its content.

The object of any statistical study is a set of units of the phenomenon being studied. The object can be the population during a census, enterprises, cities, company personnel, etc. In short, the object of observation is the statistical population under study. It is very important to define clearly the boundaries of the population being studied. For example, if the goal is to study the activities of small enterprises in a region, it is necessary to determine which forms of ownership are covered (state, private, joint, etc.) and by what criteria the enterprises will be selected: industry, sales volume, time since registration, status (active, inactive, temporarily idle), etc. The population must be homogeneous; otherwise additional difficulties will arise during the analysis and errors are almost inevitable.

Along with the object of observation and its boundaries, it is important to define the unit of the population and the unit of observation. A population unit is an individual constituent element of a statistical population. An observation unit is a phenomenon or object whose characteristics are subject to registration. The set of observation units constitutes the object of observation. For example, suppose the goal is to study the influence of various factors on the productivity of workers at the mines of Ispat-Karmet OJSC. In this case the population is determined by the goal itself: the miners working at the Ispat-Karmet mines. The unit of the population is the miner, as the carrier of information, and the unit of observation is the mine. Briefly: the unit of the population is what is being examined, and the unit of observation is the source of information.
To carry out statistical observation, it is necessary to collect data on a given characteristic, namely: to designate a statistical population consisting of materially existing objects, to define the unit and the purpose of a one-time survey of the object, and to draw up a statistical observation program.



At the first stage, a sample of the collected data is formed according to the indicated characteristics, and the data are arranged in ascending order. A frequency distribution table is then drawn up and its columns are filled in one after another.

At the second stage, in order to process the collected primary data, the selected elements are grouped and generalized according to a given characteristic, and the numerical characteristics of the sample are determined. This stage of statistical research is called the summary. The summary is the scientific processing of primary data in order to obtain generalized characteristics of the phenomenon being studied according to a number of its essential features; that is, the primary materials are brought together to form statistical aggregates, which are characterized by final absolute generalizing indicators. At the summary stage, we move from the characteristics of individual varying features of population units to the characteristics of the population as a whole, or to the characteristics of their general manifestation in the mass.

The following should be found: the range, according to the formula

$$R = x_{\max} - x_{\min};$$

the mode $M_o$, which is the value that occurs most often; the median $M_e$, which characterizes the middle value (half of the members of the series do not exceed it) and corresponds to the variant in the middle of the ranked variation series, its position being determined by its number $N_{Me} = (n + 1)/2$, where $n$ is the number of units in the population; and the arithmetic mean for the designated group, which is calculated by the formula

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i.$$

The results of the work can be presented graphically in the form of a histogram and frequency distribution polygon.
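The numerical characteristics listed above (range, mode, median with its position, and arithmetic mean) can be computed with the Python standard library. The sketch below is illustrative: the function name and the sample values are hypothetical, and the mean is taken as the simple unweighted arithmetic mean.

```python
from statistics import mean, median, multimode


def describe(values):
    """Range, mode(s), median position, median and arithmetic mean of a sample."""
    ordered = sorted(values)          # ranked variation series
    n = len(ordered)
    return {
        "range": ordered[-1] - ordered[0],   # R = x(max) - x(min)
        "modes": multimode(ordered),         # all most frequent values
        "median_position": (n + 1) / 2,      # N = (n + 1) / 2
        "median": median(ordered),
        "mean": mean(ordered),
    }


# Hypothetical sample, for illustration only.
sample = [3, 5, 5, 7, 8, 8, 8, 10, 12]
summary = describe(sample)
print(summary)
```

`multimode` returns a list because a series can have more than one most frequent value.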

The data obtained reflect what is common to all units of the population under study. Statistical observation should yield objective, comparable, and complete information that makes it possible, at subsequent stages of the research, to draw scientifically grounded conclusions about the nature and patterns of development of the phenomenon being studied.

Practical task

Conduct a statistical study to collect information on the height of 25 randomly selected students of Tomsk Polytechnic University.

Make a frequency distribution table and find the range, mode, median, and arithmetic mean of height (in cm) for the selected young men.
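One possible way to build the frequency distribution table for such a task is sketched below. The heights are invented for illustration (and there are only ten of them, to keep the output short); the function name and the column layout (value, frequency, relative frequency, cumulative frequency) are assumptions rather than a prescribed format.

```python
from collections import Counter


def frequency_table(values):
    """Rows of (value, frequency, relative frequency, cumulative frequency)."""
    counts = Counter(values)
    n = len(values)
    rows = []
    cumulative = 0
    for value in sorted(counts):      # ascending order, as in a ranked series
        cumulative += counts[value]
        rows.append((value, counts[value], counts[value] / n, cumulative))
    return rows


# Hypothetical heights in cm, not real measurements.
heights_cm = [172, 175, 175, 178, 180, 180, 180, 183, 185, 190]
table = frequency_table(heights_cm)

print(f"{'cm':>5} {'f':>4} {'rel':>6} {'cum':>4}")
for value, freq, rel, cum in table:
    print(f"{value:>5} {freq:>4} {rel:>6.2f} {cum:>4}")
```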

Statistical research (SR) makes it possible to get an idea of a particular phenomenon, study its size and level, and identify patterns. The subject of SR can be population health, the organization of medical care, environmental factors affecting health, etc.

Two methodological approaches can be used in SR:

1) studying the intensity of a phenomenon in the environment, its prevalence, and trends in the health of the population. This is carried out on general populations or on sufficiently large sample populations, which makes it possible to obtain intensive indicators and reasonably extend the data obtained to the entire general population;

2) conducting strictly planned studies to study individual factors without identifying the intensity of the phenomenon in the environment - carried out, as a rule, on small populations in order to identify new factors, study unknown or little-known cause-and-effect relationships

Stages of statistical research:

Stage 1. Drawing up a research plan and program. This stage is preparatory: the purpose and objectives of the research are determined, a research plan and program are drawn up, a program for summarizing the statistical material is developed, and organizational issues are resolved.

A) The purpose and objectives of the study must be clearly formulated. The goal determines the main direction of the research and, as a rule, is not only theoretical but also practical in nature; it is formulated clearly and unambiguously. Research objectives are then defined to achieve the stated goal.

B) The literature on the topic must be studied.

C) An organizational plan must be developed. It determines 1) the place (administrative and territorial boundaries of observation), 2) the time (specific dates for observation, processing, and analysis of the material), and 3) the subject of research (organizers, performers, methodological and organizational management, sources of research funding).

D) A research plan must be developed. It defines:

– object of study (statistical population);

– volume of research (continuous, non-continuous);

– types (current, one-time);

– methods of collecting statistical information.

E) A research (observation) program must be compiled. It includes:

– definition of the observation unit;

– list of questions (accounting characteristics) to be registered in relation to each observation unit

– development of an individual accounting (registration) form with a list of questions and characteristics to be taken into account;

– development of table layouts, into which the research results are then entered.

A separate form is filled out for each observation unit; it contains the passport part, clearly formulated program questions posed in a certain sequence and the date of filling out the document. Medical registration forms used in the practice of treatment and preventive institutions can be used as registration forms.

Sources of information can also be other medical documents (medical histories and individual outpatient records, child development histories, birth histories), reporting forms of medical institutions, etc.

To ensure the possibility of statistical development of data from these documents, information is copied onto specially designed accounting forms, the content of which is determined in each individual case in accordance with the objectives of the study.

Currently, because observation results are processed by computer, program questions can be formalized: questions in an accounting document are presented as alternatives (yes, no), or ready-made answers are offered from which a specific answer must be selected.

F) A program for summarizing the obtained data must be drawn up. It includes establishing the grouping principles, identifying the grouping characteristics, determining combinations of these characteristics, and drawing up layouts of statistical tables.

Stage 2. Collection of material (statistical observation). This stage consists of recording individual cases of the phenomenon being studied, together with the accounting characteristics that describe them, on registration forms. Before and during this work, the performers of the observation are instructed (orally or in writing) and provided with registration forms.

Statistical observation can be:

A) by time:

1) Current: the phenomenon is studied over a specific period of time (week, quarter, year, etc.) by recording each case as it occurs (counting the number of births, deaths, patients, discharges from hospital). This approach captures rapidly changing phenomena.

2) One-time: statistical data are collected at a certain (critical) point in time (population census, study of the physical development of children, preventive examinations of the population). A one-time registration reflects the state of the phenomenon at the time of study and is used to study slowly changing phenomena.

The choice of the type of observation over time is determined by the purpose and objectives of the study (characteristics of hospitalized patients can be obtained as a result of the current registration of those leaving the hospital - current observation or by a one-day census of patients in the hospital - one-time observation).

B) depending on the completeness of coverage of the phenomenon being studied:

1) Continuous: all observation units included in the population, i.e. the general population, are studied. Continuous observation is carried out in order to establish the absolute size of a phenomenon (total population, total number of births or deaths). It is also used where information is needed for operational work (recording infectious diseases, doctors' workload, etc.).

2) Non-continuous: only part of the general population is studied. It is divided into several types:

1. Monographic method: gives a detailed description of individual units of the population that are characteristic in some respect, and a deep, comprehensive description of the objects.

2. Main array method: involves studying those objects in which a significant majority of observation units are concentrated. The disadvantage of this method is that part of the population remains outside the study; although small, it may differ significantly from the main array.

3. Questionnaire method: the collection of statistical data using specially designed questionnaires addressed to a specific circle of people. This kind of study is voluntary, so the return of questionnaires is often incomplete. The answers often bear the imprint of subjectivity and chance. The method is used to obtain an approximate characterization of the phenomenon being studied.

4. Sampling method: the most common method, which comes down to studying a specially selected part of the observation units in order to characterize the entire population. Its advantages are a high degree of reliability of the results and a significantly lower cost. The study involves fewer performers and requires less time. In medical statistics the role of the sampling method is especially great, since medical workers usually deal with only part of the phenomenon being studied (they study a group of patients with a particular disease, or analyze the work of individual departments).

C) by the method of obtaining information during the process and the nature of its implementation

1. Direct observation (clinical examination of patients, laboratory and instrumental studies, anthropometric measurements, etc.);

2. Sociological methods: the interview (face-to-face survey), the questionnaire (correspondence survey, anonymous or non-anonymous), etc.;

3. Documentary research (copying information from medical records and reports, and from the official statistics of institutions and organizations).

Stage 3. Material processing, statistical grouping and summary. This stage begins with checking and clarifying the number of observations and the completeness and correctness of the information received, and with identifying and eliminating errors, duplicate records, etc.

To process the material properly, the primary accounting documents are coded, that is, each characteristic and each of its groups is designated by a letter or number. Coding is a technique that makes processing easier and faster and improves its quality and accuracy. The codes (symbols) are assigned arbitrarily. When coding diagnoses, it is recommended to use the international nomenclature and classification of diseases; when coding professions, a dictionary of professions.

The advantage of coding is that, after the main processing is complete, it is possible to return to the material in order to clarify new relationships and dependencies. Coded accounting material makes this easier and faster than uncoded material. After verification, the characteristics are grouped.
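Assigning a symbol to each value of a characteristic, as described above, amounts to building a codebook and applying it to the records. The sketch below is a minimal illustration with hypothetical field names and records; the numbering scheme (consecutive integers in sorted order) is an arbitrary choice, in line with the remark that codes are assigned arbitrarily.

```python
def build_codebook(values):
    """Assign consecutive integer codes (from 1) to the distinct values, in sorted order."""
    return {value: code for code, value in enumerate(sorted(set(values)), start=1)}


def encode(records, field, codebook):
    """Return copies of the records with `field` replaced by its code."""
    return [{**record, field: codebook[record[field]]} for record in records]


def decode(records, field, codebook):
    """Inverse of encode: restore the original values from their codes."""
    reverse = {code: value for value, code in codebook.items()}
    return [{**record, field: reverse[record[field]]} for record in records]


# Hypothetical registration records.
records = [
    {"id": 1, "sex": "male"},
    {"id": 2, "sex": "female"},
    {"id": 3, "sex": "male"},
]
sex_codes = build_codebook(record["sex"] for record in records)
coded = encode(records, "sex", sex_codes)
print(sex_codes, coded)
```

Keeping the codebook alongside the coded material is what makes it possible to return to the data later and decode it.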

Grouping is the division of the data under study into homogeneous, typical groups based on the most significant characteristics. Grouping can be carried out by qualitative and quantitative characteristics. The choice of grouping characteristic depends on the nature of the population being studied and the objectives of the study.

A) Typological grouping is made by qualitative (descriptive, attributive) characteristics (sex, profession, disease groups).

B) Variational grouping (by quantitative characteristics) is made on the basis of the numerical values of the characteristic (age, duration of the disease, duration of treatment, etc.). Quantitative grouping requires a decision on the size of the grouping interval: the intervals can be equal, but in some cases they are unequal and may even include so-called open groups (when grouping by age, open groups can be defined as under 1 year and 50 years and older).
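Variational grouping with unequal intervals and open groups can be sketched with a sorted list of interval boundaries. In the example below only the two open groups (under 1 year, 50 years and older) come from the text; the inner boundaries, the labels, and the ages are hypothetical.

```python
from bisect import bisect_right
from collections import Counter

# Lower bounds of every group except the first (open) one.
# The inner boundaries 15 and 30 are illustrative assumptions.
BOUNDS = [1, 15, 30, 50]
LABELS = ["under 1", "1-14", "15-29", "30-49", "50 and older"]


def age_group(age):
    """Return the label of the age interval that contains `age` (in years)."""
    if age < 0:
        raise ValueError("age must be non-negative")
    # bisect_right counts how many lower bounds are <= age.
    return LABELS[bisect_right(BOUNDS, age)]


# Hypothetical ages in years.
ages = [0.5, 3, 14, 15, 29, 30, 49, 50, 71]
counts = Counter(age_group(age) for age in ages)
for label in LABELS:
    print(f"{label:>12}: {counts[label]}")
```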

The number of groups is determined by the purpose and objectives of the study. The groups must be able to reveal the patterns of the phenomenon being studied. A large number of groups can lead to excessive fragmentation of the material and unnecessary detail. A small number of groups blurs the characteristic features.

Once the material has been grouped, the summary follows: the individual cases obtained in the statistical study are generalized into groups, counted, and entered into table layouts.

The summary of statistical material is carried out using statistical tables. A table not yet filled with numbers is called a layout.

Statistical tables can be list-based, chronological, or territorial.

A table has a subject and a predicate. The statistical subject is usually placed in the rows on the left side of the table and reflects the main characteristic. The statistical predicate is placed in the columns, from left to right, and reflects additional accounting characteristics.

Statistical tables are divided into:

A) Simple: presents the numerical distribution of the material by a single characteristic and its components. A simple table usually contains a simple list or a summary of the entire phenomenon being studied.

B) Group: presents a combination of two characteristics in relation to each other.

C) Combined: presents the distribution of the material by three or more interrelated characteristics.

When compiling tables, certain requirements must be met:

– each table must have a title reflecting its contents;

– inside the table, all columns must also have clear, short names;

– when the table is filled in, every cell must contain the corresponding numerical data. Cells left empty because the combination does not occur are marked with a dash (“-”), and cells for which no information is available are marked “n.s.” or “…”;

– after the table is filled in, the columns and rows are totaled in the bottom row and in the last column on the right;

– tables must have a single sequential numbering.
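The requirement to total the rows and columns can be illustrated with a small group table built from individual records. This is a sketch under stated assumptions: the records, the field names, and the function name are hypothetical, and the label "Total" is assumed not to occur among the category values.

```python
def crosstab_with_totals(records, row_key, col_key):
    """Count records by two characteristics and add row, column and grand totals."""
    rows = sorted({record[row_key] for record in records})
    cols = sorted({record[col_key] for record in records})
    # Start from an empty layout: every combination is present with a zero count.
    table = {row: {col: 0 for col in cols} for row in rows}
    for record in records:
        table[record[row_key]][record[col_key]] += 1
    # Last column on the right: totals for each row.
    for row in rows:
        table[row]["Total"] = sum(table[row][col] for col in cols)
    # Bottom row: totals for each column, including the grand total.
    table["Total"] = {
        col: sum(table[row][col] for row in rows) for col in cols + ["Total"]
    }
    return table


# Hypothetical registration records.
patients = [
    {"sex": "female", "group": "cardiac"},
    {"sex": "female", "group": "respiratory"},
    {"sex": "male", "group": "cardiac"},
    {"sex": "male", "group": "cardiac"},
    {"sex": "male", "group": "respiratory"},
]
table = crosstab_with_totals(patients, "sex", "group")
for name, cells in table.items():
    print(f"{name:>8}", cells)
```

Checking that the grand total equals the number of records is a quick control on the summary.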

In studies with a small number of observations, the summary is performed manually. All accounting documents are sorted into groups according to the characteristic code. The data are then counted and recorded in the appropriate cell of the table. Currently, computers are widely used for sorting and summarizing the material; they make it possible not only to sort the material by the characteristics being studied but also to calculate indicators.

Stage 4. Statistical analysis of the phenomenon under study and formulation of conclusions. This is a critical stage of the study, at which statistical indicators are calculated (frequencies, structure, average sizes of the phenomenon being studied) and presented graphically, dynamics and trends are studied, relationships between phenomena are established, forecasts are made, etc. The analysis involves interpreting the data obtained and assessing the reliability of the research results. Finally, conclusions are drawn.

Stage 5. Literary processing and presentation of the results obtained. This is the final stage and involves writing up the results of the statistical study. The results can be presented in the form of an article, a paper, a report, a dissertation, etc. Each form of presentation has its own requirements, which must be observed when writing up the results of statistical research.

The results of medical and statistical research are introduced into healthcare practice. Research results can be used in various ways: presenting them to a wide audience of medical and scientific workers, preparing instructional and methodological documents, preparing improvement proposals, and others.

Upon completion of the statistical study, recommendations and management decisions are developed, the research results are implemented into practice, and effectiveness is assessed.

In conducting a statistical study, the most important element is adherence to a strict sequence in the implementation of these stages.

 

