Polls are pouring out almost every day, and our 'Actually' fact-check team is verifying whether they can really be trusted. Today, in the second installment, I will focus on the polls conducted this year ahead of the general election.

First, a recap of the previous episode. We collected all the polls from the three months just before the 2016 general election and analyzed the trend. The idea was to compare each party's vote share in the actual general election with the 'trend' across 'all polls' available for analysis, rather than with the 'number' from a 'specific poll' that one happens to like.

At that time, our analysis found that the trend was influenced by the 'North Korea variable' and each party's 'nomination conflict'. Please refer to the last episode. ▶ (2020.03.24) [Actually] Can I believe in an overflowing poll?

STEP 1. "I can't see the trend"

Let's look at this year's general election polls in the same way. We analyzed the 'support rate' for each political party in all 62 opinion polls published from January 1 to March 25 this year.
CG: Ahn Hye-min, a data reporter

At the bottom is the Justice Party in yellow and the People's Party in orange, and the gray above them is the nonpartisan group, that is, respondents who support no party. The pink above it is the United Future Party and the blue above it is the Democratic Party. Rather than moving steadily in one direction, each party's support rate spreads widely up and down, in layers like sedimentary rock. That means there is a big difference between polls. It is difficult to grasp any tendency.

In the last episode, we looked at the trend in the polls and then worked out what had happened on the days the trend changed. This year that exercise may seem meaningless, because the trend itself is not clear. Even so, we looked at what happened from January to March this year and whether any of the news could really be called a 'variable'.
CG: Designer Jun-Seok An

There were many big issues in the run-up to this year's general election. The party nomination conflicts, which had been the biggest variable in the last general election, continued this time, though nothing was as dramatic as the old Saenuri Party's 'royal seal' row or the Democratic Party's 'pro-Roh (Chinno) cutoff'. However, the unprecedented emergence of 'proportional parties' added another conflict. The Democratic Party clashed with minor parties as it set up a 'proportional coalition party'. The United Future Party fought with its proportional party, the Future Korea Party, over proportional nominations, and Han Sun-kyo stepped down as that party's leader. Both major parties struggled more with proportional nominations than with district nominations.

Nor can it be said that there was no 'North Korea variable': North Korea fired artillery, and Kim Yeo-jung, first deputy director of the North Korean party's Central Committee, issued a statement denouncing the Blue House.

However, it would be no exaggeration to say that all these issues were buried by COVID-19. After the first COVID-19 case in Korea was confirmed on January 20, domestic news was filled with COVID-19. After Patient 31 was confirmed in Daegu on February 18 and the first death was reported, other news all but disappeared. As the situation grew urgent, the nomination conflicts slipped from public attention, and the North Korea variable was hardly covered at all. Naturally, interest in the general election fell as well.

Perhaps the COVID-19 political climate contributed to the 'disorder' in the polls.

STEP 2. The United Future Party's widely spread support rate

So are this year's general election polls not worth analyzing? Not at all.

Look at the graph above. Even by eye, the United Future Party's support rate spreads up and down more than the other parties'. This means there is a great deal of variation among polling organizations.

We need to measure how wide that spread is. We will use the concept of 'standard deviation' that we learned in middle school. Anyone who has taken the College Scholastic Ability Test will know the 'standard score', and that standard score reflects the 'standard deviation'. For each party, we calculated the standard deviation of its support rate across the polls, a statistic that shows how widely the values are spread around the average. The larger the standard deviation, the farther the values tend to fall from the mean.
The United Future Party's support rate is spread 5.30% around its average, the largest among the parties. That means it swings a lot from poll to poll. The Democratic Party's figure is 3.18%, and the Justice Party's is the smallest at 1.03%.
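
As a rough illustration of the calculation described above, the Python sketch below computes the mean and standard deviation of one party's support rate across several polls. The numbers are hypothetical and are not the 62 polls analyzed here. Using the population standard deviation (`statistics.pstdev`) is also an assumption, since the article does not say which variant was used.

```python
from statistics import mean, pstdev

# Hypothetical support rates (%) for one party across five polls.
# These are illustrative values, NOT the article's data.
support_rates = [30.0, 25.0, 35.0, 28.0, 32.0]


def spread(rates):
    """Return (mean, population standard deviation) of the support rates."""
    return mean(rates), pstdev(rates)


avg, sd = spread(support_rates)
print(f"mean = {avg:.2f}%, standard deviation = {sd:.2f}")
```

A party whose figures cluster tightly around the mean would give a small value here, while a party whose figures differ sharply between polls would give a large one.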

In other words, the current polls cannot be said to capture public sentiment toward the United Future Party accurately.

What we need to look at further is the nonpartisans, those who answered that they support no party. Their standard deviation is 7.80%, which means the figure jumps around from survey to survey. What does this mean? Who are the nonpartisans?

STEP 3. Correlation between the United Future Party's support rate and the nonpartisans

Nonpartisans are people who say they have no party they support, or are unsure which one they support. However, since they bothered to answer an annoying polling call, they have at least some interest in politics and are likely to go to the polls on election day. We can presume they are highly likely to vote for some party.

We want to find out which party's support rate is most closely tied to the size of the nonpartisan group in each poll. To look at the relationship between each party's support rate and the size of the nonpartisan group, this time we will use the 'correlation coefficient'.

The correlation coefficient is by far the most widely used statistic for measuring the relationship between two indicators. Its value always lies between -1 and 1: the closer to -1, the stronger the negative correlation, and the closer to 1, the stronger the positive correlation. A value close to 0 means the two have little linear relationship. Usually, a value of +0.5 or above, or -0.5 or below, is considered a high correlation.
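
To make the idea concrete, here is a minimal sketch of the Pearson correlation coefficient in plain Python. It assumes that this is the coefficient meant here, which the article does not state explicitly. The per-poll figures and the names `party_a` and `party_b` are hypothetical and are not the data analyzed in this piece.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    # Undefined (division by zero) if either sequence is constant.
    return cov / sqrt(var_x * var_y)


# Hypothetical per-poll figures (%), NOT the article's data.
no_party = [10.0, 15.0, 20.0, 25.0, 30.0]  # share with no party preference
party_a = [34.0, 31.0, 29.0, 24.0, 22.0]   # falls as the no-party share rises
party_b = [5.0, 6.0, 4.0, 6.0, 5.0]        # barely related to the no-party share

r_a = pearson(no_party, party_a)
r_b = pearson(no_party, party_b)
print(f"party A vs no-party share: r = {r_a:.2f}")
print(f"party B vs no-party share: r = {r_b:.2f}")
```

In this toy data, party A's coefficient comes out close to -1 and party B's close to 0, which is the kind of contrast the coefficient is meant to expose. A correlation of this kind describes how two figures move together across polls and does not by itself show why.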

The results were as follows:
The correlation coefficient between the United Future Party's support rate and the nonpartisan group was -0.91, a very strong negative correlation. The Democratic Party also showed a relatively strong negative correlation at -0.76, though not as strong as the United Future Party's. The Justice Party showed little correlation.

A correlation coefficient of -0.91 amounts to an almost 'exact inverse' relationship. In other words, the larger the nonpartisan group, the lower the United Future Party's support rate. Conversely, the smaller the nonpartisan group, the higher the United Future Party's support rate.

This suggests that a fair share of nonpartisans are weighing whether to support the United Future Party. That is why the party's support rate can swing sharply as nonpartisans change their minds.

STEP 4. How does the fluctuating poll and evaluation…

So far, the analysis has covered answers to the question, "Which party do you support?" In the past, answers to this question could be assumed to converge on the party vote share in the actual election, but this time the situation is different. It is a special situation in which proportional parties, in effect the satellite parties of each party, have sprung up everywhere. For example, even Democratic Party supporters may vote for the 'Citizens' Party', which is its proportional party, or for the 'Open Democratic Party'. Some Democratic Party supporters may even vote for the 'Justice Party'.

In other words, we need a separate analysis of polls that asked, "Which party will you vote for in the proportional representation ballot in this general election?" or "Which party do you prefer for the proportional vote?"

Because the party realignment was not finished until recently, only polls from mid-March onward can reflect it. Below is the result of analyzing nine polls that asked about the Future Korea Party, which is the United Future Party's proportional party; the Citizens' Party, which is the Democratic Party's proportional party; the Justice Party; the Open Democratic Party; and the People's Party.
The Future Korea Party is widely spread up and down, and the Citizens' Party is slightly downward, while the Citizens' Party also tends to be upward. However, there are too few samples to determine a trend. As with the first graph above, I fed the data into a program that draws the trend as a wave pattern reflecting the confidence level and the margin of error, but the result came out broken and showed nothing meaningful. So I only plotted the dots.

However, here too, each party's expected vote share, and especially the size of the nonpartisan group, fluctuates as in the earlier trend. The movement of the nonpartisan group looks important.

As a result, trends in the polls are harder to read than ever before. Were the polls not conducted properly, or is this a political environment in which public opinion cannot properly converge? Unfortunately, it looks more like the latter than the former. The current political landscape is complicated to an unprecedented degree. Even party leaders get their own party's name mixed up. Voters, who need to pass judgment on the past four years, are confused too. I worry whether the 20th National Assembly can be evaluated properly.

That was the 2020 general election fact check.

(Data search: Kim Hye-ri, Kim Jung-woo)

P.S. If you subscribe to 'Mabu News', the newsletter service of the SBS News data journalism team 'Mabujajakchi', you can receive much sharper news. Plenty of high-quality coverage is planned ahead of the general election. During this general election, the SBS News fact-check team 'Actually' will also take part.