Social statistics is the use of statistical tools and methods to study different aspects of society and human behavior in a social context. It can also be understood as the branch of statistics that studies living conditions during documented and well-known historical stages of social development. It also provides quantitative information about populations and social groups in the form of census and demographic data, which are used for inferential analysis and policymaking. In contrast with economic statistics, which is based on economic phenomena and processes, social statistics investigates the political, legal and ideological aspects of people’s lives and the standard of living of a population through cross-sectional studies of populations and social groups. Research in this area is usually concerned with the development of statistical methods that can be applied across the social sciences. In this regard, statisticians play a crucial role in every part of social inquiry, including study design, measurement and evaluation, data linkage, development and selection of statistical models, and assessment.

In the simplest definition, statistics are numbers, probabilities or summarized patterns observed in a population that explain certain characteristics of its members. Statistical analysis is the collection, organization, interpretation and presentation of data about the population. Data for statistical analysis can be collected in various forms, including numbers, opinions, images and sounds, which are generated daily in huge quantities through surveys, social media, formal and informal interactions, and organizational databases. Social statisticians analyze and try to make sense of these data using analytical tools in order to understand society and social change. Their objective is to capture people’s attitudes, map behavioral patterns and circumstances, and describe changes in behaviors and populations using quantitative data and hypothesis-driven research.

Social statistics uses a system of indexes to summarize and characterize the living conditions in a society. The indexes describe the organization and class structure of the society, the composition of the population, and the distribution of income among population groups. Several other indexes describe further aspects of society, such as education levels, public health, composition and availability of manpower, housing, social security, work and leisure, community life, and the political and moral environment. Social statistics also provides the techniques to investigate and test research questions, which are usually based around the impact of policies on different aspects of life.

**Key examples include**:

- Does wealth make people happier?
- Do higher qualifications mean higher earnings?
- What are the patterns of population growth?
- What measures do people take to face financial hardships?
- Does volunteering increase a person’s sense of wellbeing?

Statistical methods are also used to compare and evaluate data from before and after policy implementation. For example, statistics can be used to measure poverty in a population and then assess the costs and impact of a policy which aims to provide financial support to people living in poverty. Additionally, they can be used to study underlying relationships and patterns in datasets such as:

- People’s responses to multiple survey questions
- The impact on people’s behavior of circumstantial aspects of their lives, such as the unemployment rate of the place they live in or the quality of education in the school or class they study in
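
The before-and-after policy comparison described above can be illustrated with a minimal sketch. It assumes poverty is measured as a headcount ratio (the share of people below a poverty line), which is only one of several possible measures; the incomes, the poverty line, and the flat transfer are hypothetical values chosen for illustration.

```python
def headcount_ratio(incomes, poverty_line):
    """Share of people whose income falls below the poverty line."""
    return sum(1 for income in incomes if income < poverty_line) / len(incomes)

# Hypothetical monthly incomes for the same ten people before the policy.
before = [400, 550, 620, 700, 810, 950, 1200, 1500, 2100, 3000]
poverty_line = 750   # hypothetical threshold
transfer = 150       # hypothetical flat payment to everyone below the line

# Incomes after the policy: only people below the line receive the transfer.
after = [income + transfer if income < poverty_line else income for income in before]

rate_before = headcount_ratio(before, poverty_line)
rate_after = headcount_ratio(after, poverty_line)
cost = transfer * sum(1 for income in before if income < poverty_line)

print(f"poverty rate before: {rate_before:.0%}, after: {rate_after:.0%}, cost: {cost}")
```

A real evaluation would also need a comparison group or a model of what would have happened without the policy; this sketch only shows the bookkeeping of the two measurements.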

Techniques of statistical testing and modeling can be used, for example, to:

- Predict the results of an election
- Determine general attitude towards the country’s economy
- Determine the patterns in crime

The application of social statistics is often related to the application of statistical methodology in areas such as official statistics, survey methodology, public policy, demography, education, criminology, political science, and marketing research. The nature of the social sciences is such that the data concerning them cannot always be quantified. Also, some of the most important data studied in these areas, such as those about addictions or crimes, are often too personal, informal or illegal to be collected easily. These complex issues must be considered in order to select the appropriate statistical method for collecting, analyzing and presenting the data.

**Methods of social statistics include:**

**Sampling:** Sampling refers to techniques for selecting a desired number of individuals from the target population in order to assess the population characteristics under study. Samples (the selected individuals) can be drawn in two ways: probability-based and non-probability-based. In probability sampling, every individual in the population has a known, nonzero chance of being selected (an equal chance in the special case of simple random sampling), so the probability of selection can be computed. In non-probability sampling, on the other hand, some individuals may have no chance of being selected, or the probability of their selection cannot be computed. Selecting the appropriate sampling method is of great importance in social statistics, but it is often overlooked by researchers, with emphasis usually given to the conclusions of the research rather than to the way the sample was collected. The other concern is the use of a sample that is too small or otherwise does not represent the entire population. Both issues can distort results and raise doubts about the reliability of the research.
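
As a rough illustration of probability sampling, the sketch below draws a simple random sample and a proportional stratified sample using only Python's standard library. The population, the `area` attribute, and the function names are hypothetical; the point is that in both designs each unit's chance of selection is known in advance.

```python
import random
from collections import defaultdict

def simple_random_sample(units, n, seed=0):
    """Draw n units without replacement; each has inclusion probability n / len(units)."""
    rng = random.Random(seed)
    return rng.sample(units, n)

def proportional_stratified_sample(units, stratum_of, fraction, seed=0):
    """Sample the same fraction from every stratum, so inclusion probabilities stay known."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)
    sample = []
    for name in sorted(strata):
        members = strata[name]
        k = round(len(members) * fraction)  # units to draw from this stratum
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical population: 1,000 people, the first 300 labelled "rural".
population = [{"id": i, "area": "rural" if i < 300 else "urban"} for i in range(1000)]

srs = simple_random_sample(population, 100)
stratified = proportional_stratified_sample(population, lambda p: p["area"], 0.10)
rural_in_stratified = sum(1 for p in stratified if p["area"] == "rural")

print(len(srs), len(stratified), rural_in_stratified)
```

The stratified draw guarantees that rural residents make up exactly their population share of the sample, whereas the simple random sample matches that share only on average.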

**Descriptive Statistics:** Two methods, regression analysis and the analysis of variance (ANOVA), are used in social statistics to present data and identify relationships, trends and abnormal behavior among variables. Used purely descriptively, they summarize the data without making any stochastic assumptions. In the classical approach, they are combined with confidence intervals and hypothesis tests, which do rely on assumptions about how the data were generated. Regression is one of the most commonly used methods of analyzing relationships. Its main objectives are to determine whether a relationship exists among the variables in the first place, to determine the strength of that relationship, and to derive an equation that describes it. ANOVA is a method of comparing several means at the same time using a fixed confidence level. The data used for this analysis typically consist of the results of an experiment. It is also widely used in social statistics because it can test whether a response depends on the explanatory variables.
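
The two techniques can be sketched in a few lines of standard-library Python. This is a minimal illustration, assuming a single explanatory variable for the regression and a one-way layout for ANOVA; the education, income, and group scores are made-up numbers, and the function names are hypothetical.

```python
from statistics import mean

def fit_line(x, y):
    """Least-squares line y = intercept + slope * x, plus R^2 as strength of fit."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    ss_total = sum((yi - y_bar) ** 2 for yi in y)
    ss_resid = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    return slope, intercept, 1 - ss_resid / ss_total

def one_way_anova_f(groups):
    """F statistic: between-group mean square divided by within-group mean square."""
    values = [v for g in groups for v in g]
    grand_mean = mean(values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

# Hypothetical data: years of education vs. income (in thousands).
education = [10, 12, 14, 16, 18]
income = [21, 23, 29, 31, 36]
slope, intercept, r_squared = fit_line(education, income)

# Hypothetical wellbeing scores for three groups of respondents.
f_stat = one_way_anova_f([[4, 5, 6], [6, 7, 8], [8, 9, 10]])

print(f"income = {intercept:.1f} + {slope:.1f} * education, R^2 = {r_squared:.3f}")
print(f"F = {f_stat:.1f}")
```

Turning the F statistic into a p-value requires the F distribution, which is not in the standard library; a statistics package would normally supply it.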

**Parametric Methods:** Parametric methods are among the basic statistical methods and are based on the assumption that a fixed set of parameters determines a probability model. These methods are typically used when the statistician knows that the population is approximately normal. The mean and the standard deviation are the two parameters of a normal distribution. Common parametric methods include the confidence interval for a population mean when the standard deviation is known, the confidence interval for a population mean when the standard deviation is not known, and the confidence interval for a population variance.
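
The first two intervals can be sketched as follows, assuming an approximately normal population. With a known standard deviation the critical value comes from the standard normal distribution; with an unknown one it comes from Student's t distribution, which the standard library does not provide, so the sketch takes the t critical value as an argument (2.262 is the usual table value for 95% confidence with 9 degrees of freedom). The sample of weekly working hours is hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def mean_ci_known_sigma(sample, sigma, confidence=0.95):
    """z-interval for the mean when the population standard deviation is known."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    half_width = z * sigma / sqrt(len(sample))
    centre = mean(sample)
    return centre - half_width, centre + half_width

def mean_ci_unknown_sigma(sample, t_critical):
    """t-interval for the mean; t_critical must match len(sample) - 1 degrees of freedom."""
    half_width = t_critical * stdev(sample) / sqrt(len(sample))
    centre = mean(sample)
    return centre - half_width, centre + half_width

# Hypothetical weekly working hours reported by ten respondents.
hours = [38, 40, 42, 35, 45, 39, 41, 37, 43, 40]

lo_z, hi_z = mean_ci_known_sigma(hours, sigma=3.0)           # sigma assumed known
lo_t, hi_t = mean_ci_unknown_sigma(hours, t_critical=2.262)  # 95%, 9 degrees of freedom

print(f"known sigma:   ({lo_z:.2f}, {hi_z:.2f})")
print(f"unknown sigma: ({lo_t:.2f}, {hi_t:.2f})")
```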

**Nonparametric Methods:** Commonly used nonparametric methods in social statistics include the Mann-Whitney U test for two independent samples, the Spearman rank correlation test, and the sign test for a population median. These methods do not require the population to follow a distribution described by a fixed set of parameters, so they depend far less on the characteristics of the population. They are easy to apply and place few constraints on the study. They are often more reliable than parametric methods when the assumptions behind those methods do not hold, although they can be less powerful when the assumptions are met.
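
Of the tests listed, the sign test is the simplest to write out. The sketch below is an exact two-sided version for a hypothesised median: it only counts how many observations fall above and below that value and compares the split with a fair-coin binomial distribution. The satisfaction scores and the function name are hypothetical.

```python
from math import comb

def sign_test_p_value(sample, hypothesised_median):
    """Exact two-sided sign test; observations equal to the median are dropped."""
    diffs = [v - hypothesised_median for v in sample if v != hypothesised_median]
    n = len(diffs)
    positives = sum(1 for d in diffs if d > 0)
    k = min(positives, n - positives)  # size of the smaller side
    # Probability of a split at least this lopsided under Binomial(n, 0.5), one tail.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# Hypothetical satisfaction scores on a 1-10 scale; is the median different from 5?
scores = [7, 8, 6, 9, 8, 7, 9, 8, 5, 8]
p_value = sign_test_p_value(scores, hypothesised_median=5)

print(f"p = {p_value:.4f}")
```

No assumption about the shape of the score distribution is needed, only that the observations are independent.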

**Multivariate Methods:** Real-world problems in the social sciences usually involve more than one variable. As these variables are related in most cases, they must be considered together in the statistical method. Methods for analyzing data with multiple variables include correspondence analysis, cluster analysis, and factor analysis. Correspondence analysis helps researchers analyze multi-way frequency tables; its main objective is to plot the data in fewer dimensions in order to identify their key characteristics. Cluster analysis is a method of classification in which individuals are grouped so that those assigned to the same group are closer in characteristics to each other than to individuals in other groups. Factor analysis is used in cases where the quantity of interest, such as human intelligence, cannot be directly quantified or measured. It relates observable and unobservable variables through a probability model in order to draw statistical inferences.
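
To make the idea of cluster analysis concrete, here is a minimal sketch of k-means, one of several clustering techniques (the text above does not name a specific algorithm, so this choice is an assumption). The household data and the starting centres are hypothetical, and in practice the variables would usually be standardised first so that no single one dominates the distance.

```python
from math import dist
from statistics import mean

def k_means(points, initial_centres, max_iterations=100):
    """Assign each point to its nearest centre, then move centres to cluster means."""
    centres = [tuple(c) for c in initial_centres]
    clusters = []
    for _ in range(max_iterations):
        clusters = [[] for _ in centres]
        for p in points:
            nearest = min(range(len(centres)), key=lambda i: dist(p, centres[i]))
            clusters[nearest].append(p)
        # An empty cluster keeps its previous centre.
        new_centres = [
            tuple(mean(coord) for coord in zip(*members)) if members else centres[i]
            for i, members in enumerate(clusters)
        ]
        if new_centres == centres:  # assignments have stopped changing
            break
        centres = new_centres
    return centres, clusters

# Hypothetical households: (income in thousands, years of schooling).
households = [(20, 9), (22, 10), (21, 8), (60, 16), (62, 17), (58, 15)]
centres, clusters = k_means(households, initial_centres=[households[0], households[3]])

print(centres)
print([len(c) for c in clusters])
```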

**Categorical Data:** Social science researchers often need to study categorical data in which a categorical variable can assume a certain number of specified discrete values. Such values generally occur in situations where the respondents are assigned to groups or when a property does or does not hold true. Categorical data are commonly used in social sciences to measure opinions and attitudes.
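
A common way to examine two categorical variables together is to cross-tabulate them and compute Pearson's chi-square statistic, which measures how far the observed counts are from what independence would predict. The sketch below is illustrative only: the groups, the agree/disagree responses, and the function name are hypothetical.

```python
from collections import Counter

def chi_square_statistic(pairs):
    """Pearson chi-square statistic for a list of (row_category, column_category) pairs."""
    cell = Counter(pairs)
    row_totals = Counter(r for r, _ in pairs)
    col_totals = Counter(c for _, c in pairs)
    n = len(pairs)
    stat = 0.0
    for r in row_totals:
        for c in col_totals:
            expected = row_totals[r] * col_totals[c] / n  # count expected under independence
            stat += (cell[(r, c)] - expected) ** 2 / expected
    return stat

# Hypothetical survey: 100 respondents in two groups answering agree/disagree.
responses = (
    [("group A", "agree")] * 30 + [("group A", "disagree")] * 20
    + [("group B", "agree")] * 20 + [("group B", "disagree")] * 30
)
chi2 = chi_square_statistic(responses)

print(f"chi-square = {chi2:.2f}")
```

For a 2x2 table the statistic has one degree of freedom; converting it to a p-value requires the chi-square distribution from a statistics package.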

**Time Series:** A time series is a sequence of observations of a variable made in chronological order. The observations in a time series are generally treated as dependent on one another. Time series data, analyzed using methods such as the autoregressive model and the moving average model, are frequently seen in the social sciences in studies of population or employment growth, improvements in standards of living, and similar trends.
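
As a small illustration of the autoregressive idea, the sketch below fits a first-order model, x_t = c + phi * x_(t-1), by least squares on consecutive pairs of observations and uses it for a one-step forecast. The series is hypothetical and deliberately noise-free so the fit is exact; real data would include a random error term, and the moving average model is not shown here.

```python
from statistics import mean

def fit_ar1(series):
    """Least-squares estimate of (c, phi) in x_t = c + phi * x_(t-1)."""
    prev, curr = series[:-1], series[1:]
    p_bar, c_bar = mean(prev), mean(curr)
    phi = (
        sum((p - p_bar) * (c - c_bar) for p, c in zip(prev, curr))
        / sum((p - p_bar) ** 2 for p in prev)
    )
    const = c_bar - phi * p_bar
    return const, phi

# Hypothetical series generated by x_t = 2 + 0.5 * x_(t-1), starting at 10.
unemployment = [10.0, 7.0, 5.5, 4.75, 4.375]

const, phi = fit_ar1(unemployment)
forecast = const + phi * unemployment[-1]  # one step ahead

print(f"c = {const:.2f}, phi = {phi:.2f}, next value = {forecast:.4f}")
```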

