Computer Science (CSIE) – Page 2 – Emily Wa 蛙哇哇

October 17, 2010

Failure [INSTALL_FAILED_INSUFFICIENT_STORAGE]

The internal storage had enough free space, so in my case the error was caused by a write-permission problem.

1. First, check that Android -> Settings -> Applications -> Unknown sources is enabled.

2. Make sure /data/app is writable, i.e. the mode of the directory is rwxrwxr-x (775).
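
For readers unfamiliar with the notation, the octal mode 775 and the string rwxrwxr-x describe the same permission bits. A tiny Python sketch (not part of the original fix; it only decodes a mode value and does not touch any device) shows the correspondence:

```python
import stat

# 0o775 is the permission part; S_IFDIR marks the entry as a directory.
mode = stat.S_IFDIR | 0o775

# stat.filemode renders the mode the way `ls -l` would.
mode_string = stat.filemode(mode)
print(mode_string)  # drwxrwxr-x
```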

April 5, 2010

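Counting grades per score interval in Excel

I'm writing this down so I can simply copy the formula next time.

Problem: There are 15 students with the grades shown below. Use a formula to count how many students scored 60 or below, 61~70, 71~80, 81~90, and 91~100.

Solution: see the figure below.

Note: enter the formula =FREQUENCY(B2:B16,D2:D6)
　　B2:B16 is the range of grades
　　D2:D6 is the range of interval upper bounds (bins)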
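
As a cross-check outside Excel, the same counting rule can be sketched in Python. This is only an illustration: the scores are made up (the real ones are in the screenshot), and the function name `frequency` is my own. Like FREQUENCY, it counts each value in the first bin whose upper bound is greater than or equal to it, and it returns one extra count for values above the last bin.

```python
from bisect import bisect_left

def frequency(values, bins):
    """Count values per interval; bins are ascending upper bounds (inclusive)."""
    counts = [0] * (len(bins) + 1)  # last slot: values above the highest bin
    for v in values:
        # bisect_left finds the first bin whose upper bound is >= v
        counts[bisect_left(bins, v)] += 1
    return counts

# Hypothetical scores, NOT the ones from the original worksheet
scores = [55, 60, 61, 70, 71, 85, 90, 91, 100]
bins = [60, 70, 80, 90, 100]
result = frequency(scores, bins)
print(result)  # [2, 2, 1, 2, 2, 0]
```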

*All photos on this blog come from 藍色水瓶子裡的藍色深淵*

March 19, 2010

Programmers need to debug, and that often means picking points in a program where extra code can be inserted to help debug efficiently. A simple example is a Console.WriteLine() call that prints values or reports whether the part just executed completed successfully. However, such lines clutter the code, and the debugging code has to be removed before the software is released.

C# takes care of this overhead with special methods that help the programmer debug without having to clean up the debugging code for the release phase. These methods are called conditional methods. The compiler omits every call to a conditional method when its compilation symbol is not defined, so the calls never run in a release build (the method body itself is still compiled into the assembly).

Well, that's good enough. But how do I make a method conditional? .NET provides the attribute System.Diagnostics.ConditionalAttribute (alias Conditional) for this. Let's look at some code now.

Defining the conditional method

using System;
using System.Diagnostics;

public class MyTracer
{
    // Calls to this method are compiled only when DEBUG is defined.
    [Conditional("DEBUG")]
    public static void LogThisMessage(string myMessage, int severity)
    {
        // Write message to screen or a file or ...
        Console.WriteLine(" DEBUG MESSAGE : " + myMessage
            + " SEVERITY LEVEL : " + severity);
    }
}

The Conditional attribute has been applied to the LogThisMessage() method with the DEBUG conditional compilation symbol. This tells the compiler to omit calls to the method if the conditional symbol DEBUG is not defined. Note that a conditional method must return void.

Using the conditional method

With the Solution Configurations set to Debug mode, executing the following code produces the output window shown below.

public static void Main(string[] args)
{
    // Some error occurred in my code
    // Log the message
    MyTracer.LogThisMessage("The error 1221 has occurred.", 5);
}

DEBUG on

Now let's see what happens to our conditional code in Release mode. To switch between Debug and Release mode, use the Solution Configurations selector in the Visual Studio IDE, shown below.

Change the mode to "Release" and you will see that the output of the conditional code has disappeared.

We used the predefined DEBUG compilation symbol in our code. You can also define your own symbols and use them for conditional compilation. Custom compilation symbols are defined in Project Properties -> Build tab -> General. Checkboxes are provided there to enable or disable the DEBUG and TRACE symbols.

.NET also provides two classes that offer similar functionality:

System.Diagnostics.Debug
System.Diagnostics.Trace

These classes contain methods that can also be used for debugging if you do not need more specific custom methods.

April 3, 2009

Introduction to Data mining 4-3

There is a lot in this section, so I'll take it slowly!

4-3 Decision Tree Induction

Wow, this section has so many subsections. Plenty to read.

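4-3-1 How a Decision Tree Works

First, a brief introduction to the "tree". It has three types of nodes.

Root node: has no incoming edges and zero or more outgoing edges.
Internal node: has exactly one incoming edge and two or more outgoing edges.
Leaf or terminal node: has exactly one incoming edge and no outgoing edges.

In a decision tree, each leaf node is assigned a class label. The non-terminal nodes, which include the root node and the internal nodes, contain attribute test conditions that separate records with different characteristics.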

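4-3-2 How to Build a Decision Tree

The number of decision trees that can be built from a given set of attributes is exponential, so finding the optimal tree is computationally infeasible because the search space is too large.

Practical algorithms therefore usually use a greedy strategy to build a decision tree. Hunt's algorithm is one example; it is also the basis of many decision tree induction algorithms, including ID3, C4.5, and CART.

This subsection covers Hunt's algorithm and discusses the design issues of such algorithms.

First, an introduction to Hunt's algorithm.

The algorithm grows a decision tree recursively by partitioning the training records into successively purer subsets.

Dt: the set of training records associated with node t. y={y1,y2,…,yc}: the class labels.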

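Def. of Hunt's algorithm

Step 1: If all the records in Dt belong to the same class yt, then t is a leaf node labeled yt.

Step 2: If Dt contains records that belong to more than one class, an attribute test condition is selected to partition the records into smaller subsets. A child node is created for each outcome of the test condition, and the records in Dt are distributed to the child nodes according to the outcomes.

The algorithm is then applied recursively to each child node.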

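The figure below illustrates how the algorithm proceeds:

The table on the right holds the training records. Its attribute columns are, from left to right, ID, Refund, Marital Status, and Income, and the class label is whether the person cheats (Cheat).

The left side shows how the decision tree is built. Taking the simplest approach, we split on the first attribute. Refund has two branches: on the Yes branch the result is always "does not cheat", whereas on the No branch a person may or may not cheat, so that branch must be split further. Next comes the second attribute, Marital Status: married people do not cheat, while single or divorced people may cheat, so that branch needs another split. With the last attribute, Income, those earning less than 80K do not cheat and those earning 80K or more do.

That is how Hunt's algorithm builds the tree. The attributes could be used in many different orders; this is just one of them.

Hunt's algorithm works if every combination of attribute values is present in the training data and each combination has a unique class label. These assumptions are too stringent for most practical situations, so two additional cases have to be handled: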

1. Some of the child nodes created in Step 2 may be empty, i.e. no records are associated with them.
=> The node is declared a leaf node with the same class label as the majority class of the training records associated with its parent node.
2. In Step 2, if all the records in Dt have identical attribute values (apart from the class label), they cannot be split any further.
=> The node is declared a leaf node with the same class label as the majority class of the training records associated with this node.
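
The two steps and the two extra cases can be put together in a short Python sketch. This is an illustration under assumptions, not the textbook's pseudocode: the records are invented, all attributes are treated as categorical, and the test attribute is simply the first one that still separates the records (a real implementation would pick it with an impurity measure).

```python
from collections import Counter

def majority_class(records):
    """Most common class label among (attributes, label) pairs."""
    return Counter(label for _, label in records).most_common(1)[0][0]

def hunt(records, attributes, parent_majority=None):
    """Return a leaf (class label) or a tuple (attribute, {value: subtree})."""
    # Extra case 1: empty child -> leaf with the parent's majority class.
    # (Cannot occur below because we split only on observed values; it is
    # needed when splitting on an attribute's full domain.)
    if not records:
        return parent_majority
    labels = {label for _, label in records}
    # Step 1: all records belong to one class -> leaf with that class.
    if len(labels) == 1:
        return labels.pop()
    # Extra case 2: no attribute can separate the records -> majority leaf.
    usable = [a for a in attributes if len({r[a] for r, _ in records}) > 1]
    if not usable:
        return majority_class(records)
    # Step 2: choose a test attribute (assumption: first usable one) and
    # create one child per outcome, then recurse on each child.
    attr = usable[0]
    remaining = [a for a in attributes if a != attr]
    children = {}
    for value in sorted({r[attr] for r, _ in records}):
        subset = [(r, y) for r, y in records if r[attr] == value]
        children[value] = hunt(subset, remaining, majority_class(records))
    return (attr, children)

# Hypothetical training records (income already discretized)
records = [
    ({"refund": "yes", "married": "no", "income": "high"}, "no"),
    ({"refund": "no", "married": "yes", "income": "high"}, "no"),
    ({"refund": "no", "married": "no", "income": "low"}, "no"),
    ({"refund": "no", "married": "no", "income": "high"}, "yes"),
]
tree = hunt(records, ["refund", "married", "income"])
print(tree)
```

With these made-up records the tree splits on refund, then married, then income, which mirrors the order described in the walkthrough above.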

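Design Issues of Decision Tree Induction

A learning algorithm for inducing decision trees must address two issues:

How should the training records be split?
Each recursive step of the tree-growing process must select an attribute test condition to divide the records into smaller subsets. To implement this step, the algorithm must provide a method for specifying the test condition for different attribute types.
How should the splitting procedure stop?
A stopping condition is needed to terminate the tree-growing process. One possible strategy is to keep expanding a node until all its records belong to the same class or all its records have identical attribute values. Both conditions are sufficient to stop the process, but other criteria can terminate it earlier; these are discussed later in 4-4-5.

Phew, that was tiring! Reading the original textbook, understanding it, and then writing it up as my own notes takes quite a bit of effort.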

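4-3-3 Methods for Expressing Attribute Test Conditions

This subsection is more straightforward. It describes how to express attribute test conditions and the corresponding outcomes for different attribute types.

Binary Attributes: the test condition for a binary attribute generates only two possible outcomes. Example: body temperature, warm-blooded or cold-blooded.

Nominal Attributes: a nominal attribute can have many values, so its test condition can be expressed in two ways. As the figure below shows, one is a multiway split and the other a binary split.

Ordinal Attributes: ordinal attributes can also produce binary or multiway splits. Ordinal attribute values can be grouped as long as the grouping does not violate the order of the attribute values. See the figure below.

Continuous Attributes: for continuous attributes, the test condition can be expressed as a comparison test (A<v) or (A>=v), or as a range query of the form vi<=A<vi+1 for i=1,..,k. See the figure below.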
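
The two kinds of test on a continuous attribute can be sketched as follows. The function names and cut points are my own, chosen only for illustration: `comparison_test` is the binary test A<v, and `range_bucket` returns the index i of the interval with vi<=A<vi+1 (0 means A is below the first cut point).

```python
from bisect import bisect_right

def comparison_test(a, v):
    """Binary split: True for the (A < v) branch, False for (A >= v)."""
    return a < v

def range_bucket(a, cut_points):
    """Multiway split: number of cut points <= a, i.e. the interval index."""
    return bisect_right(cut_points, a)

cut_points = [10, 20, 30]  # hypothetical v1, v2, v3
print(comparison_test(75, 80))       # True
print(range_bucket(5, cut_points))   # 0
print(range_bucket(10, cut_points))  # 1
print(range_bucket(25, cut_points))  # 2
```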

This subsection is fairly easy; the figures make it simple to understand.

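4-3-4 Measures for Selecting the Best Split

This subsection explains how to choose the best split. We first define some notation: p(i|t) denotes the fraction of records at node t that belong to class i.

In a two-class problem, the class distribution at any node can be written as (p0,p1), where p1=1-p0.

The measures for selecting the best split are usually based on the degree of impurity of the child nodes. Impurity measures include

Entropy(t) = -∑ p(i|t) log2 p(i|t)
Gini(t) = 1 - ∑ [p(i|t)]^2
Classification error(t) = 1 - max_i [p(i|t)]

where each sum runs over the classes i, c is the number of classes, and 0log0=0 in the entropy calculation.

The figure below compares the three measures. All three reach their maximum when the class distribution is uniform (p=0.5) and their minimum when all the records belong to the same class (p=0 or 1).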
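
Since the comparison figure is not reproduced here, a small sketch can check the claim numerically. It implements the three impurity measures as the textbook defines them (entropy, Gini index, classification error); the function names are my own, and each function takes the class distribution as a list of fractions p(i|t).

```python
from math import log2

def entropy(p):
    """-sum p_i log2 p_i, with the convention 0 log 0 = 0."""
    return sum(-x * log2(x) for x in p if x > 0)

def gini(p):
    """1 - sum p_i^2"""
    return 1 - sum(x * x for x in p)

def classification_error(p):
    """1 - max p_i"""
    return 1 - max(p)

# Uniform two-class distribution: every measure is at its maximum.
print(entropy([0.5, 0.5]), gini([0.5, 0.5]), classification_error([0.5, 0.5]))
# -> 1.0 0.5 0.5

# Pure node: every measure is 0.
print(entropy([0, 1]), gini([0, 1]), classification_error([0, 1]))
```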

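To determine how well a test condition performs, we compare the degree of impurity of the parent node (before the split) with that of the child nodes (after the split). The larger the difference, the better the test condition.

The gain, △, is a criterion for judging the goodness of a split:

△ = I(parent) - ∑ [N(vj)/N] I(vj), summed over the child nodes vj

I(‧) is the impurity measure of a node, N is the total number of records at the parent node, and N(vj) is the number of records associated with child node vj. Decision tree algorithms usually choose the test condition that maximizes the gain. Because I(parent) is the same for all test conditions, this is equivalent to minimizing the weighted sum ∑ of the child-node impurities.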
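
As a rough sketch, the gain can be computed from class counts as the parent impurity minus the weighted average impurity of the children. The Gini index is used as I(‧) here by assumption (any impurity measure would do), and the counts are invented to show the two extremes.

```python
def gini(counts):
    """Gini index of a node given its per-class record counts."""
    n = sum(counts)
    return 1 - sum((c / n) ** 2 for c in counts)

def gain(parent_counts, children_counts, impurity=gini):
    """I(parent) - sum_j N(v_j)/N * I(v_j)"""
    n = sum(parent_counts)
    weighted = sum(sum(child) / n * impurity(child) for child in children_counts)
    return impurity(parent_counts) - weighted

# Hypothetical parent with 6 records per class
perfect = gain([6, 6], [[6, 0], [0, 6]])  # children are pure
useless = gain([6, 6], [[3, 3], [3, 3]])  # children look like the parent
print(perfect, useless)  # 0.5 0.0
```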

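Splitting of Binary Attributes

As in the figure below, suppose there are two ways to split the data into smaller subsets. If attribute A is chosen, Gini(N1)=0.4898 and Gini(N2)=0.480, so the weighted average of the Gini index over the descendant nodes is (7/12)x0.4898+(5/12)x0.480=0.486. Likewise, if B is chosen, the weighted average of the Gini index is 0.375. Because B yields the smaller Gini index, attribute B is chosen for the split.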
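
The numbers quoted for attribute A can be reproduced with a few lines of Python. The class counts (4, 3) for N1 and (2, 3) for N2 are an assumption on my part: they are chosen because they give exactly the quoted Gini values and node sizes 7 and 5; the original figure is not available here.

```python
def gini(counts):
    """Gini index of a node from its per-class record counts."""
    n = sum(counts)
    return 1 - sum((c / n) ** 2 for c in counts)

# Assumed class counts for the two children of the split on A
n1, n2 = [4, 3], [2, 3]
total = sum(n1) + sum(n2)  # 12 records

gini_n1 = gini(n1)
gini_n2 = gini(n2)
weighted_a = sum(n1) / total * gini_n1 + sum(n2) / total * gini_n2

print(round(gini_n1, 4), round(gini_n2, 4), round(weighted_a, 3))
# -> 0.4898 0.48 0.486
```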

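Splitting of Nominal Attributes

A binary split is computed the same way as for a binary attribute: the Gini index of the left grouping is 0.468 and that of the right grouping is 0.167, so the right grouping is preferred. The multiway split is computed in much the same way: Gini index=4/20 x 0.375+8/20 x 0 + 8/20 x 0.219=0.163, which is even smaller than either binary split.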
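
The multiway arithmetic is easy to verify. The sketch below uses only the node sizes and per-node Gini values quoted in the text; the variable names are my own.

```python
# (number of records, Gini index) for each child of the multiway split
children = [(4, 0.375), (8, 0.0), (8, 0.219)]
total = sum(size for size, _ in children)  # 20 records

multiway = sum(size / total * g for size, g in children)
print(round(multiway, 3))  # 0.163

# Compare with the two binary groupings quoted above
candidates = {"binary (left)": 0.468, "binary (right)": 0.167, "multiway": multiway}
best = min(candidates, key=candidates.get)
print(best)  # multiway
```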

Splitting of Continuous Attributes

Gain Ratio

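In a few extreme cases, a test condition can produce undesirable results because the number of records in each partition is too small to make reliable predictions. There are two ways to deal with this. The first is to restrict test conditions to binary splits only. The other is to modify the splitting criterion so that it takes the number of outcomes of the attribute test condition into account. For example, the C4.5 decision tree algorithm uses the gain ratio to judge the goodness of a split.

Gain ratio=△info/Split Info, where Split Info = -∑ P(vi) log2 P(vi), summed over i=1,..,k

k: the number of splits. For example, if every attribute value has the same number of records, then P(vi)=1/k and the split info equals log2k. This suggests that an attribute producing a large number of splits also has a large split info, which in turn reduces its gain ratio.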
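
A short sketch shows the effect described above. The function names and the information-gain value of 0.5 are hypothetical; `split_info` takes the number of records in each partition.

```python
from math import log2

def split_info(sizes):
    """-sum P(v_i) log2 P(v_i), where P(v_i) is the fraction of records in split i."""
    n = sum(sizes)
    return sum(-(s / n) * log2(s / n) for s in sizes if s > 0)

def gain_ratio(info_gain, sizes):
    """Information gain divided by split info."""
    return info_gain / split_info(sizes)

# Equal-sized partitions: split info equals log2(k)
print(split_info([5, 5]))        # 1.0
print(split_info([5, 5, 5, 5]))  # 2.0

# The same (hypothetical) information gain is penalized more with more splits
print(gain_ratio(0.5, [5, 5]))        # 0.5
print(gain_ratio(0.5, [5, 5, 5, 5]))  # 0.25
```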

Finally done! 4-3 was so long, I'm exhausted! But a lot of things are much clearer to me now.

April 2, 2009

Introduction to Data mining 4-1~4-2

Chapter 4 Classification

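What is "Classification"?

"Classification" is the task of assigning an unclassified object to one of several predefined categories. It is a pervasive problem that covers a wide variety of applications.

Chapter 4 has seven sections in total; only the first six are important.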

4-1 Preliminaries

Def 4.1 Classification

Classification is the task of learning a target function f that maps each attribute set x to one of the predefined class labels y.

In short: learn a function f from attribute sets x to predefined class labels y.

Descriptive Modeling

A classification model can serve as an explanatory tool to distinguish between objects of different classes.

Predictive Modeling

A classification model can also be used to predict the class label of an unknown object.

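Classification techniques are best suited for predicting or describing data sets with binary or nominal categories. They are less effective for ordinal categories. Why? Take classifying a person as a member of the high-, medium-, or low-income group: these techniques do not take the implicit order among the categories into account. This chapter therefore focuses on binary or nominal class labels.

4-1 is quickly done. A short break, then on to 4-2...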

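4-2 General Approach to Solving a Classification Problem

This section gives a brief overview of how classification problems are solved. A classifier is a systematic approach to building classification models from an input data set. Examples include decision tree classifiers, rule-based classifiers, neural networks, SVMs (support vector machines), and naive Bayes classifiers.

Each technique employs a learning algorithm to find the model that best fits the relationship between the attribute set and the class label of the input data. A key objective of the learning algorithm is therefore to build models with good generalization capability; simply put, models that accurately predict the class labels of previously unknown objects.

Figure taken from Introduction to Data Mining by Pang-Ning Tan et al.

As the figure above shows, the overall process uses a learning algorithm and the training set to learn a model, which is then applied to the test set.

How is the performance of a classification model evaluated? By the counts of test records that the model predicts correctly and incorrectly. These counts are usually tabulated as shown below.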

Confusion matrix
                            Predicted Class
                            Class = 1    Class = 0
Actual Class    Class = 1   f11          f10
Actual Class    Class = 0   f01          f00

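In the confusion matrix, each entry fij is the number of records from class i that are predicted to be class j. For instance, f11 is the number of records whose actual class is 1 and that are predicted as class 1, f01 is the number of records whose actual class is 0 but that are predicted as class 1, and so on.

We also compute the accuracy:

Accuracy = number of correct predictions / total number of predictions = (f11+f00)/(f11+f10+f01+f00)

Correspondingly, we compute the error rate:

Error rate = number of wrong predictions / total number of predictions = (f10+f01)/(f11+f10+f01+f00)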
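
The two formulas translate directly into code. A minimal sketch, with made-up counts for the four cells of the confusion matrix:

```python
def accuracy(f11, f10, f01, f00):
    """Correct predictions divided by all predictions."""
    return (f11 + f00) / (f11 + f10 + f01 + f00)

def error_rate(f11, f10, f01, f00):
    """Wrong predictions divided by all predictions."""
    return (f10 + f01) / (f11 + f10 + f01 + f00)

# Hypothetical counts: 100 test records in total
counts = dict(f11=40, f10=10, f01=5, f00=45)
acc = accuracy(**counts)
err = error_rate(**counts)
print(acc, err)  # 0.85 0.15
```

The two values always add up to 1, so maximizing accuracy and minimizing the error rate are the same goal.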

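Most classification algorithms aim for the highest accuracy or, equivalently, the lowest error rate. Section 4-5 discusses other evaluation methods.

That's the end of 4-2. Phew! Time for another break...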

March 11, 2009

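It's all the fault of "ASP.NET Development Server"

I have been developing an ASP.NET application with VS2008 lately. Two days ago, after working hard to finish a pile of logic code, I wanted to debug it to check that the logic was correct, and fate played a big joke on me. I pressed F5 and waited happily for the page to appear, but all I got was "page cannot be displayed". I was dumbfounded. Not now, please!

Whenever I hit a problem I turn to the almighty Google. Many people have run into the same issue, but I could not find a good solution. So, in the spirit of experimentation, I reinstalled VS2008 and IIS. The result...

It still would not run.

Even worse, IIS could no longer run ASP.NET. Remember to always install IIS first and .NET afterwards; otherwise you are in for a lot of problems.

Otherwise you end up like me, reinstalling IIS yet again. Remember to check the individual IIS components and make sure the ones you need are selected.

At this point, browsing the site directly worked, but launching it from VS2008 still gave "page cannot be displayed". Then, on a whim, I tried something: press F5 first, let the browser window open, and then...

I changed "localhost" to "127.0.0.1" in the address, and it actually ran. Debugging worked too.

What is going on here? Sob! Never mind, putting out the fire comes first. I'll look for a proper fix later.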

March 10, 2009

A very handy uninstaller: 【Revo Uninstaller Portable】

【Revo Uninstaller Portable】

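If you have software that refuses to be removed, kicked out, or deleted, I strongly recommend this tool. It not only uninstalls the program but also lists the related registry entries and the leftover folders and files, so that the user can select which ones to delete.

Official website

Official download

A portable (no-install) version can be downloaded from 阿榮福利味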

March 9, 2009

ToolkitScriptManager's Error Message in AJAX Toolkit

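While developing the 神啊！ system, I had one step that must wait for an external program to finish before it can go through. While that program runs I show a busy-waiting animation in the UI, but every time it stopped after a little under two minutes.

After being bothered by this for days, yesterday I finally found the culprit: the ToolkitScriptManager control. It has a property that sets the timeout, and the default is 90 s; no wonder it always stopped after the same amount of time. On a side note, I had been testing with Firefox all along, and yesterday I noticed that its memory usage kept growing every time I ran 神啊！ in it. My first guess is that Firefox has a serious memory leak, since it is written in C++, but I have not verified that yet.

Because of the ballooning memory I decided to switch to IE for testing; at least it should release everything when it is closed. It was only because I switched to IE that I saw this error message: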

WebForms.PageRequestManagerTimeoutException

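So that is how I learned it was a timeout. Ha! I still need to build up more web experience.

The final fix was to set AsyncPostBackTimeout to 0, which means no time limit.

Now I can finally move on to the next step.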